Effortless File Extraction: Unzipping Archives with Go

Learn how to easily unzip files in Go using the standard library, covering common scenarios and best practices for robust archive handling.

Working with compressed archives is a common task in many applications, from deploying software to processing data. Go's standard library provides excellent support for handling .zip files through its archive/zip package. This article will guide you through the process of unzipping files in Go, demonstrating how to extract contents to a specified directory, handle errors, and ensure proper resource management.

Understanding the archive/zip Package

The archive/zip package reads and writes ZIP archives. To unzip, you open the archive with zip.OpenReader and iterate over the File slice of the reader it returns. Each element is a *zip.File whose embedded header carries metadata about the entry, including its name, uncompressed size, and modification time, and whose Open method returns a reader for the entry's content.
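As a minimal sketch of that reading side, the program below opens an archive with zip.OpenReader and prints a few header fields for each entry. The helpers sampleZipPath and entryNames are hypothetical names introduced only for this illustration, and the temporary sample archive exists just to make the sketch runnable on its own.

```go
package main

import (
	"archive/zip"
	"fmt"
	"os"
)

// sampleZipPath writes a tiny archive to a temporary file and returns its path.
// It is a hypothetical helper that exists only to make the sketch runnable.
func sampleZipPath() (string, error) {
	tmp, err := os.CreateTemp("", "sample-*.zip")
	if err != nil {
		return "", err
	}
	defer tmp.Close()

	w := zip.NewWriter(tmp)
	fw, err := w.Create("notes.txt")
	if err != nil {
		return "", err
	}
	if _, err := fw.Write([]byte("hello")); err != nil {
		return "", err
	}
	// A trailing slash marks a directory entry.
	if _, err := w.Create("docs/"); err != nil {
		return "", err
	}
	// Close writes the central directory; without it the archive is unreadable.
	if err := w.Close(); err != nil {
		return "", err
	}
	return tmp.Name(), nil
}

// entryNames opens the archive at path, prints some metadata for each entry,
// and returns the entry names in archive order.
func entryNames(path string) ([]string, error) {
	r, err := zip.OpenReader(path)
	if err != nil {
		return nil, err
	}
	defer r.Close()

	var names []string
	for _, f := range r.File {
		// f embeds zip.FileHeader, so the metadata is available as plain fields.
		fmt.Printf("%-10s dir=%-5v size=%d bytes\n",
			f.Name, f.FileInfo().IsDir(), f.UncompressedSize64)
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	path, err := sampleZipPath()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.Remove(path)

	if _, err := entryNames(path); err != nil {
		fmt.Println("error:", err)
	}
}
```

Nothing is extracted here; the loop only inspects headers, which is often enough to validate an archive before writing anything to disk.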

flowchart TD
    A[Start Unzip Process] --> B{Open ZIP File Reader}
    B -- Error --> E[Handle Error & Exit]
    B -- Success --> C[Iterate Through ZIP Entries]
    C --> D{Is Entry a Directory?}
    D -- Yes --> F[Create Directory]
    D -- No --> G[Create File Path]
    G --> H{Open Entry for Reading}
    H -- Error --> E
    H -- Success --> I[Create Destination File]
    I -- Error --> E
    I -- Success --> J["Copy Content (Entry to Dest File)"]
    J -- Error --> E
    J -- Success --> K[Close Entry Reader & Dest File]
    K --> C
    C -- No More Entries --> L[Close ZIP Reader]
    L --> M[End Unzip Process]

Flowchart of the Unzipping Process in Go

Implementing a Basic Unzip Function

To unzip a file, you typically need to perform the following steps:

  1. Open the ZIP archive: Use zip.OpenReader to get a zip.ReadCloser.
  2. Iterate through files: Loop over ReadCloser.File to get each *zip.File header.
  3. Determine destination: Construct the full path for the extracted file or directory.
  4. Handle directories: If an entry is a directory, create it in the destination.
  5. Extract files: If an entry is a file, open it from the archive, create a new file at the destination, and copy the contents.
  6. Close resources: Ensure all readers and writers are properly closed to prevent resource leaks.

package main

import (
	"archive/zip"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// Unzip extracts the zip archive at src into the directory dest and returns
// the destination paths of the entries it processed.
func Unzip(src, dest string) ([]string, error) {

	var filenames []string

	r, err := zip.OpenReader(src)
	if err != nil {
		return filenames, err
	}
	defer r.Close() // Close the zip reader when the function returns

	for _, f := range r.File {
		fpath := filepath.Join(dest, f.Name)

		// Check for ZipSlip vulnerability (path traversal)
		if !strings.HasPrefix(fpath, filepath.Clean(dest)+string(os.PathSeparator)) {
			return filenames, fmt.Errorf("%s: illegal file path", fpath)
		}

		filenames = append(filenames, fpath)

		if f.FileInfo().IsDir() {
			// Create the directory and report a failure instead of ignoring it
			if err := os.MkdirAll(fpath, os.ModePerm); err != nil {
				return filenames, err
			}
			continue
		}

		// Create all necessary directories for the file
		if err = os.MkdirAll(filepath.Dir(fpath), os.ModePerm); err != nil {
			return filenames, err
		}

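		// Open the archive entry first so that a failure here does not
		// leave an empty destination file behind
		rc, err := f.Open()
		if err != nil {
			return filenames, err
		}

		outFile, err := os.OpenFile(fpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
		if err != nil {
			rc.Close()
			return filenames, err
		}

		_, err = io.Copy(outFile, rc)

		// Close both handles right away rather than deferring, so open
		// files do not accumulate across loop iterations
		closeErr := outFile.Close()
		rc.Close()

		if err != nil {
			return filenames, err
		}
		// A failed Close on the destination can mean data was not fully written
		if closeErr != nil {
			return filenames, closeErr
		}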
	}
	return filenames, nil
}

func main() {
	zipFile := "example.zip"
	destDir := "extracted_files"

	// Create a dummy zip file for demonstration
	createDummyZip(zipFile)

	fmt.Printf("Unzipping '%s' to '%s'\n", zipFile, destDir)
	filenames, err := Unzip(zipFile, destDir)
	if err != nil {
		fmt.Println("Error unzipping:", err)
		return
	}

	fmt.Println("Successfully unzipped files:")
	for _, file := range filenames {
		fmt.Println("-", file)
	}

	// Clean up dummy zip and extracted files
	// os.Remove(zipFile)
	// os.RemoveAll(destDir)
}

// createDummyZip creates a simple zip file for testing purposes.
func createDummyZip(zipFile string) {
	newZipFile, err := os.Create(zipFile)
	if err != nil {
		panic(err)
	}
	defer newZipFile.Close()

	zipWriter := zip.NewWriter(newZipFile)
	defer zipWriter.Close()

	// Add a file
	file1, err := zipWriter.Create("file1.txt")
	if err != nil {
		panic(err)
	}
	_, err = file1.Write([]byte("This is the content of file1."))
	if err != nil {
		panic(err)
	}

	// Add a file in a subdirectory
	file2, err := zipWriter.Create("subdir/file2.txt")
	if err != nil {
		panic(err)
	}
	_, err = file2.Write([]byte("This is the content of file2 in a subdirectory."))
	if err != nil {
		panic(err)
	}

	// Add an empty directory
	_, err = zipWriter.Create("empty_dir/")
	if err != nil {
		panic(err)
	}
}
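
One possible refinement of the extraction loop above is to move the per-entry work into its own function, so that defer can release the handles after every entry rather than at the end of the whole run. The sketch below assumes that design; extractEntry and roundTrip are hypothetical names, and roundTrip builds its archive in memory with zip.NewReader purely to keep the example self-contained.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// extractEntry copies a single archive entry to target. Because it is a
// function of its own, the deferred calls run after every entry.
func extractEntry(f *zip.File, target string) (err error) {
	rc, err := f.Open()
	if err != nil {
		return err
	}
	defer rc.Close()

	out, err := os.OpenFile(target, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
	if err != nil {
		return err
	}
	defer func() {
		// Report a failed Close unless an earlier error is already being returned.
		if cerr := out.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()

	_, err = io.Copy(out, rc)
	return err
}

// roundTrip is a hypothetical demo helper: it builds a one-entry archive in
// memory, extracts that entry into a temporary directory, and returns the
// extracted content.
func roundTrip(content string) (string, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	fw, err := w.Create("hello.txt")
	if err != nil {
		return "", err
	}
	if _, err := fw.Write([]byte(content)); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}

	r, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return "", err
	}
	dir, err := os.MkdirTemp("", "unzip-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, r.File[0].Name)
	if err := extractEntry(r.File[0], target); err != nil {
		return "", err
	}
	data, err := os.ReadFile(target)
	return string(data), err
}

func main() {
	got, err := roundTrip("This is the content of file1.")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

The named return value lets the deferred closure surface an error from closing the destination file, which a plain defer out.Close() would silently drop.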

Security Considerations: ZipSlip Vulnerability

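A critical security concern when unzipping archives is the Zip Slip vulnerability (written ZipSlip in the code comment above). A malicious archive can contain entry names such as ../../../../etc/passwd; if such a name is joined onto the destination without validation, extraction writes files outside the intended directory. The Unzip function guards against this: filepath.Join cleans the combined path, resolving any .. elements, and the strings.HasPrefix test then rejects every result that does not begin with filepath.Clean(dest) followed by a path separator. One limitation is that the prefix test assumes dest names a directory other than the current one. If dest is ".", filepath.Join drops the leading "./" and every entry is rejected.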
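
As an alternative sketch, not the check used in the article's code, the entry name itself can be validated before it is joined onto the destination. The example assumes Go 1.20 or later, where filepath.IsLocal is available; safeJoin is a hypothetical helper name. Like the prefix check, this is a purely lexical test and does not account for symbolic links that already exist inside the destination.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// safeJoin is a hypothetical helper: it joins an archive entry name onto dest
// and rejects names that would resolve outside of dest.
func safeJoin(dest, name string) (string, error) {
	// ZIP entry names use forward slashes; convert to the OS separator first.
	local := filepath.FromSlash(name)
	// IsLocal (Go 1.20+) rejects empty names, absolute paths, and paths whose
	// ".." elements would climb above the starting directory.
	if !filepath.IsLocal(local) {
		return "", fmt.Errorf("%s: illegal file path", name)
	}
	return filepath.Join(dest, local), nil
}

func main() {
	names := []string{"file1.txt", "subdir/file2.txt", "../../etc/passwd", "/etc/passwd"}
	for _, name := range names {
		p, err := safeJoin("extracted_files", name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", p)
	}
}
```

Because the name is checked on its own, the result does not depend on how dest is spelled, so a destination of "." behaves the same as any other directory.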

Running the Example

The program above is self-contained. Its main function first calls the createDummyZip helper, which writes an example.zip containing two files and an empty directory, and then extracts that archive into a new directory named extracted_files. The steps below walk through running it.

1. Save the Code

Save the provided Go code into a file named unzip.go.

2. Run the Program

Open your terminal or command prompt, navigate to the directory where you saved unzip.go, and run the command: go run unzip.go.

3. Verify Extraction

After execution, you should see a new directory named extracted_files in the same location. Inside, you'll find file1.txt, a subdir directory containing file2.txt, and an empty_dir.
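
The same check can be done from Go instead of a file browser. The sketch below is one way to do it, assuming the extracted_files directory from the example; listTree is a hypothetical helper that walks a directory with filepath.WalkDir and returns the relative path of everything beneath it.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listTree returns every path below root, relative to root and written with
// forward slashes. Directories get a trailing slash.
func listTree(root string) ([]string, error) {
	var paths []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // for example, root does not exist
		}
		if p == root {
			return nil // skip the root itself
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		if d.IsDir() {
			rel += "/"
		}
		paths = append(paths, rel)
		return nil
	})
	return paths, err
}

func main() {
	// Defaults to the directory used in the example; another directory can
	// be passed as the first command-line argument.
	root := "extracted_files"
	if len(os.Args) > 1 {
		root = os.Args[1]
	}

	paths, err := listTree(root)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Run it from the directory that contains extracted_files after the unzip program has finished; the listing should include file1.txt, subdir/ with file2.txt inside it, and empty_dir/.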