title: “Parsing ZIP File Uploads”
tags:

  • JavaScript

slug: 8d1e37c8
date: 2021-07-11 15:22:51
summary: “How to handle ZIP file uploads: reading ZIPs on the frontend, writing on the backend, and practical parsing tips.”

I recently needed to upload ZIP files: read on the frontend, send to the backend, write to a ZIP file, then extract. During integration, errors indicated the ZIP was invalid.

From the error, the written byte stream was clearly wrong.

Current code

Here is the current code that reads the ZIP file.

Frontend

    const reader = new FileReader();
    reader.readAsText(file, 'UTF-8');
    reader.onload = (evt) => {
      setUploadContent(Base64.encode(evt.target.result));
    };

Backend

The backend write code, in Go, looks roughly like this:

	decContractsSource, err := base64.StdEncoding.DecodeString(h.Req.ContractsSourceBase64)
	if err != nil {
		msg := fmt.Sprintf("base64 DecodeString error %v", err)
		seelog.Errorf(msg)
		h.SetBaseResponse(apiCommon.ErrCodeInternalError, msg)
		return
	}
	// DecodeString already returns []byte, so it can be written to disk directly.
	err = ioutil.WriteFile("hellooworld.zip", decContractsSource, 0644)
	if err != nil {
		msg := fmt.Sprintf("WriteFile error, AppId{%v} %v", h.Req.AppId, err)
		seelog.Errorf(msg)
		h.SetBaseResponse(apiCommon.ErrCodeInternalError, msg)
		return
	}

The backend simply decodes base64 and writes bytes to a ZIP file.

So the likely issue is on the frontend read side.

About ZIP

The following is quoted from Wikipedia:

ZIP is an archive file format that supports lossless data compression.

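In JS, reader.readAsText is for text files. ZIP is a binary archive, and most of its bytes are not valid UTF-8. Reading it as text replaces every invalid byte sequence with the replacement character U+FFFD, so the original bytes cannot be recovered. That’s the bug.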
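
The corruption is easy to reproduce outside the browser. The sketch below assumes that readAsText(file, 'UTF-8') behaves like a standard UTF-8 decoder and that the Base64 step encodes the resulting string as UTF-8 again. The sample bytes are made up for illustration: a ZIP signature followed by two bytes that are not valid UTF-8.

```javascript
// Hypothetical sample: ZIP signature "PK\x03\x04" plus two bytes that are invalid UTF-8.
const original = Uint8Array.from([0x50, 0x4b, 0x03, 0x04, 0xff, 0x8f]);

// Roughly what reading as text does: decode the bytes as UTF-8.
const asText = new TextDecoder('utf-8').decode(original);

// Encoding the string back to bytes does not restore the original content.
const roundTripped = new TextEncoder().encode(asText);

console.log(Array.from(original));     // 6 bytes
console.log(Array.from(roundTripped)); // 10 bytes: each invalid byte became EF BF BD
```

The ASCII signature survives, which is why the result still looks like a ZIP at first glance, but every invalid byte has been swapped for the three-byte encoding of U+FFFD.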

FileReader APIs

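  • FileReader.readAsDataURL()

    • Reads the file and returns a data: URL with the contents Base64-encoded
  • FileReader.readAsText()

    • Reads the file as text
  • FileReader.readAsArrayBuffer()

    • Reads the file into an ArrayBuffer of raw bytes
  • FileReader.readAsBinaryString()

    • Deprecated legacy API; prefer readAsArrayBuffer()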
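
Of the methods listed above, readAsArrayBuffer() is another binary-safe option. The following is a minimal sketch, not the code used in this project: arrayBufferToBase64 and readZipAsBase64 are hypothetical helper names, and the sketch assumes an environment where btoa is available (browsers and recent Node.js versions).

```javascript
// Convert raw bytes to Base64 without passing them through a text decoder.
function arrayBufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  let binary = '';
  const chunkSize = 0x8000; // build the string in chunks to avoid huge argument lists
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Browser-only helper (hypothetical): resolves with the file's Base64 content.
function readZipAsBase64(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(arrayBufferToBase64(reader.result));
    reader.onerror = () => reject(reader.error);
    reader.readAsArrayBuffer(file);
  });
}

// The ZIP signature "PK\x03\x04" encodes to "UEsDBA==".
console.log(arrayBufferToBase64(Uint8Array.from([0x50, 0x4b, 0x03, 0x04]).buffer));
```

Compared with readAsDataURL, this route needs a manual Base64 step but has no prefix to strip.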

For ZIPs, do not use readAsText. Use readAsDataURL instead, and strip the MIME prefix data:application/zip;base64,.
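
The MIME type in that prefix is whatever the browser reports, and it is not always application/zip (some systems report application/x-zip-compressed, for example). A minimal sketch that avoids matching the MIME type at all is to cut at the first comma. stripDataUrlPrefix is a hypothetical helper name, and the sketch assumes the input is either a Base64 data: URL or bare Base64 text.

```javascript
// Return only the Base64 payload of a data: URL.
// Base64 never contains a comma, so the first comma ends the prefix.
function stripDataUrlPrefix(dataUrl) {
  if (!dataUrl.startsWith('data:')) {
    return dataUrl; // already bare Base64 (assumption: caller passed the payload only)
  }
  const comma = dataUrl.indexOf(',');
  return comma === -1 ? '' : dataUrl.slice(comma + 1);
}

console.log(stripDataUrlPrefix('data:application/zip;base64,UEsDBA=='));
console.log(stripDataUrlPrefix('data:application/x-zip-compressed;base64,UEsDBA=='));
```

Both calls print the same payload, regardless of which MIME type appears in the prefix.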

After this change, the upload tested OK.

Demo

  • To verify the problem, I put together a small demo; take a look if you are interested.

  • FileReader.readAsDataURL() works for both text and binary (e.g., ZIP) and can fully replace readAsText here.

Here are the key code snippets:

const reader = new FileReader();
reader.readAsDataURL(file); 
···
fileContent = evt.target.result.replace(/^(data:[a-z-\/]+;base64,)/, '')
···

···
// Backend
// Buffer.from replaces the deprecated `new Buffer(...)` constructor.
const buff = Buffer.from(req.body.file, 'base64');
fs.writeFileSync(`./test.${req.body.fileType === 'zip' ? 'zip' : 'txt'}`, buff);
···
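
A cheap safeguard on the backend is to compare the decoded length with the size the browser saw. This is only a sketch under assumptions: decodeUpload is a hypothetical helper, and expectedSize is a hypothetical extra request field that the frontend would fill from file.size. It will not detect every kind of corruption, but a text-decoded upload usually changes length and would be rejected.

```javascript
// Decode a Base64 upload and verify that it has the size the client reported.
// `expectedSize` is a hypothetical request field (the browser's file.size).
function decodeUpload(base64Text, expectedSize) {
  const bytes = Buffer.from(base64Text, 'base64');
  if (bytes.length !== expectedSize) {
    throw new Error(`size mismatch: expected ${expectedSize} bytes, decoded ${bytes.length}`);
  }
  return bytes;
}

// A 4-byte payload (the ZIP signature "PK\x03\x04") passes the check.
const decoded = decodeUpload('UEsDBA==', 4);
console.log(decoded.length);
```

Failing early with a size mismatch points at the upload path directly, instead of surfacing later as an "invalid ZIP" error during extraction.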

Other gotchas

I limited uploads to .zip, but on macOS Chrome, .xlsx was still accepted, because .xlsx itself is a ZIP container. Tighten client‑side checks accordingly.
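
One way to tighten the check is to test the file name explicitly before reading the file. This is a minimal sketch; isZipFileName is a hypothetical helper name. Checking the content signature would not help here, since an .xlsx file starts with the same ZIP signature.

```javascript
// Accept only names that end in ".zip" (case-insensitive).
function isZipFileName(fileName) {
  return fileName.toLowerCase().endsWith('.zip');
}

console.log(isZipFileName('contracts.ZIP')); // true
console.log(isZipFileName('report.xlsx'));   // false
```

In the upload handler this would be called with file.name, rejecting the file before any FileReader call is made.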

Final Thoughts

  • A ZIP archive is compressed binary data, not plain text, so its bytes are generally not valid UTF-8. Reading it as text alters those bytes, which is why the backend ended up writing an invalid ZIP.
Alan H
Authors
Developer and gadget enthusiast who enjoys tinkering, sharing, and open source.