Hashing provides a way to verify the integrity of files and data by generating compact digital fingerprints, known as checksums or digests, using algorithms like MD5, SHA-256, and BLAKE2b. Even a minor change in a file's content results in a completely different hash, which makes the technique vital for detecting corruption, unauthorized modifications, or tampering, especially when downloading critical files such as operating system images. In Go, hashing is implemented with the standard library and the golang.org/x/crypto extension packages, and it plugs directly into file handling workflows. Beyond integrity checks, hashing also helps with tasks like file identification and validation in applications.
When it comes to working with both files and data, eventually you'll want to be able to check a file to ensure it hasn't been modified, corrupted, or even tampered with. This is where a concept known as hashing comes into play, which allows you to generate a unique fingerprint of a file's contents.
What is Hashing?
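A hash algorithm takes input of any size and produces a short, fixed-size value that acts as a fingerprint of that input. The same input always produces the same hash, while different inputs produce different hashes in practice (collisions are possible in theory, but for a strong algorithm such as SHA-256 they are infeasible to find).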
Simple Example
Let's create a test file and see hashing in action:
# Create a test file
echo "1 2 3 4 5" > words.txt
# Generate MD5 hash
md5sum words.txt
# Output: a1b2c3d4e5f6... (hexadecimal string)
Now if we modify the file:
# Remove the last number
echo "1 2 3 4" > words.txt
# Generate hash again
md5sum words.txt
# Output: different hash! (shows file changed)
This makes hashing a really useful mechanism for:
- Comparing two files together
- Detecting changes within a file
- Verifying file integrity when downloading files
- Ensuring critical data remains unchanged over time
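The same effect can be reproduced in Go without touching the file system. The following is a minimal sketch, assuming SHA-256 instead of MD5; the helper name fingerprint is made up for this illustration, and the trailing "\n" mimics the newline that echo appends.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint is a hypothetical helper: it returns the SHA-256 digest
// of data as a lowercase hex string.
func fingerprint(data []byte) string {
	sum := sha256.Sum256(data) // one-shot hashing, returns a [32]byte
	return hex.EncodeToString(sum[:])
}

func main() {
	before := fingerprint([]byte("1 2 3 4 5\n"))
	after := fingerprint([]byte("1 2 3 4\n"))

	fmt.Println(before)
	fmt.Println(after)
	fmt.Println("changed:", before != after) // changed: true
}
```

sha256.Sum256 is convenient when the data is already in memory; for files, the streaming approach used later in this lesson avoids loading everything at once.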
Real-World Use Cases
1. File Integrity Verification
When downloading files from websites, you can verify the download didn't get corrupted during transfer.
2. Content Management
I actually use hashing in the Dreams of Code website, where I hash raw video files inside my CLI application to generate unique IDs. This ensures the ID remains consistent for each video file, no matter where I'm running the code (production vs development).
3. Security Applications
- Password storage: Store hashed versions instead of plain text
- Data validation: Ensure downloaded software hasn't been tampered with
4. Operating System Security
Linux distributions like Arch Linux provide checksums for their ISOs:
# Arch Linux provides both SHA256 and BLAKE2b checksums
sha256sum archlinux-x86_64.iso
b2sum archlinux-x86_64.iso
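This matters because an operating system image runs with high privileges. Verifying the checksum, obtained from a source you trust, confirms that you are installing the legitimate image and not a compromised one from a poisoned mirror.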
Building a Hash Utility in Go
Let's create a CLI application to hash files and verify their integrity.
Project Setup
# Initialize the project
mkdir hasher
cd hasher
go mod init hasher
Create the main structure:
package main
import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)
func main() {
	// We'll implement this step by step
}
Step 1: Basic File Hashing
func main() {
	// Check command line arguments
	if len(os.Args) < 2 {
		log.Fatal("must provide file to hash")
	}
	filename := os.Args[1]
	// Open the file
	file, err := os.Open(filename)
	if err != nil {
		log.Fatalf("error opening file: %v", err)
	}
	defer file.Close()
	// Create SHA256 hasher
	hasher := sha256.New()
	// Copy file contents to hasher
	// hasher implements io.Writer interface
	_, err = io.Copy(hasher, file)
	if err != nil {
		log.Fatalf("error reading file: %v", err)
	}
	// Generate the hash sum
	sum := hasher.Sum(nil)
	// Convert to hexadecimal string
	hexSum := hex.EncodeToString(sum)
	fmt.Println(hexSum)
}
Understanding the Code
The Sum() Method
sum := hasher.Sum(nil)
The Sum() method appends the current hash to the byte slice passed in and returns the result. Since we pass nil, we get just the hash. You could do:
sum := hasher.Sum([]byte("prefix"))
// Result: "prefix" + hash_bytes
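To make the append behaviour concrete, here is a small sketch. The helper sumWithPrefix is hypothetical and exists only for this illustration; note that the prefix is copied into the result but is not itself hashed.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// sumWithPrefix is a hypothetical helper: it hashes data with a fresh
// SHA-256 hasher and returns prefix followed by the 32 digest bytes.
func sumWithPrefix(prefix, data []byte) []byte {
	h := sha256.New()
	h.Write(data) // hash.Hash documents that Write never returns an error
	return h.Sum(prefix)
}

func main() {
	plain := sumWithPrefix(nil, []byte("hello"))
	tagged := sumWithPrefix([]byte("prefix"), []byte("hello"))

	fmt.Println(len(plain))                                // 32
	fmt.Println(len(tagged))                               // 38 (6 prefix bytes + 32)
	fmt.Println(bytes.HasPrefix(tagged, []byte("prefix"))) // true
	fmt.Println(bytes.Equal(tagged[6:], plain))            // true
}
```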
Why Hexadecimal?
The hash is returned as bytes, but checksums are typically displayed as hexadecimal strings for readability:
// Raw bytes: [21, 129, 181, ...]
// Hex string: "1581b5..."
Testing the Implementation
go build
./hasher lotsofwords.txt
# Output: long hexadecimal string
# Compare with system utility
sha256sum lotsofwords.txt
# Should match our output!
Step 2: Adding Hash Verification
Let's extend our program to verify hashes:
func main() {
	if len(os.Args) < 2 {
		log.Fatal("must provide file to hash")
	}
	filename := os.Args[1]
	// Open and hash the file (same as before)
	file, err := os.Open(filename)
	if err != nil {
		log.Fatalf("error opening file: %v", err)
	}
	defer file.Close()
	hasher := sha256.New()
	// Check the copy error: a failed read would otherwise yield a wrong hash
	if _, err := io.Copy(hasher, file); err != nil {
		log.Fatalf("error reading file: %v", err)
	}
	sum := hasher.Sum(nil)
	hexSum := hex.EncodeToString(sum)
	// If expected hash provided, verify it
	if len(os.Args) >= 3 {
		expected := os.Args[2]
		if hexSum == expected {
			fmt.Println("✅ Hash verification: PASSED")
		} else {
			fmt.Println("❌ Hash verification: FAILED")
			fmt.Printf("Expected: %s\n", expected)
			fmt.Printf("Got: %s\n", hexSum)
		}
	} else {
		// Just print the hash
		fmt.Println(hexSum)
	}
}
Testing with Real Files
Download an Arch Linux ISO and verify it:
# Download Arch Linux ISO (or any file with published checksums)
wget https://mirror.example.com/archlinux-x86_64.iso
# Get the published SHA256 checksum from archlinux.org
# Let's say it's: 815b4c8b7c9d2e5f...
# Verify using our tool
./hasher archlinux-x86_64.iso 815b4c8b7c9d2e5f...
# Output: ✅ Hash verification: PASSED
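One caveat with comparing hex strings directly: a published checksum written in uppercase, or pasted with stray whitespace, fails the == check even though it describes the same bytes. A possible refinement, sketched below with a hypothetical helper named matches, is to decode both sides and compare the raw bytes. This is one option under those assumptions, not the lesson's required approach.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strings"
)

// matches is a hypothetical helper: it reports whether two hex checksums
// encode the same bytes, ignoring letter case and surrounding whitespace.
func matches(got, expected string) (bool, error) {
	g, err := hex.DecodeString(strings.TrimSpace(got))
	if err != nil {
		return false, fmt.Errorf("decoding computed hash: %w", err)
	}
	e, err := hex.DecodeString(strings.TrimSpace(expected))
	if err != nil {
		return false, fmt.Errorf("decoding expected hash: %w", err)
	}
	return bytes.Equal(g, e), nil
}

func main() {
	ok, err := matches("1581b5", " 1581B5\n")
	fmt.Println(ok, err) // true <nil>

	ok, err = matches("1581b5", "1581b6")
	fmt.Println(ok, err) // false <nil>

	_, err = matches("1581b5", "not-hex")
	fmt.Println(err != nil) // true
}
```

Decoding also rejects malformed input early, so a typo in the expected checksum is reported as an error instead of a plain mismatch.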
Working with BLAKE2b
The golang.org/x/crypto module, maintained by the Go team outside the standard library, provides additional hash algorithms. Let's add BLAKE2b support:
Installing the Extension
go get golang.org/x/crypto/blake2b
Adding BLAKE2b Support
package main
import (
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
	"golang.org/x/crypto/blake2b"
)
func main() {
	if len(os.Args) < 2 {
		log.Fatal("must provide file to hash")
	}
	filename := os.Args[1]
	file, err := os.Open(filename)
	if err != nil {
		log.Fatalf("error opening file: %v", err)
	}
	defer file.Close()
	// Use BLAKE2b instead of SHA256
	hasher, err := blake2b.New512(nil) // 512-bit, no key
	if err != nil {
		log.Fatalf("error creating hasher: %v", err)
	}
	if _, err := io.Copy(hasher, file); err != nil {
		log.Fatalf("error reading file: %v", err)
	}
	sum := hasher.Sum(nil)
	hexSum := hex.EncodeToString(sum)
	fmt.Println(hexSum)
}
BLAKE2b Parameters
blake2b.New512(key)
- Size: 512 bits (64 bytes)
- Key: Optional; enables BLAKE2b's built-in keyed mode, which produces a MAC (message authentication code). This is a separate construction from HMAC
- Pass nil for simple hashing
- Pass a key (at most 64 bytes) for authenticated hashing
Key Concepts
Hash Algorithms as io.Writer
All Go hash algorithms implement the io.Writer interface:
// This means you can use them anywhere io.Writer is expected
io.Copy(hasher, file) // Copy file to hasher
hasher.Write([]byte("data")) // Write data directly
This design makes Go's I/O operations incredibly flexible and composable.
Available Hash Algorithms
Many algorithms are available across the standard library's crypto packages and golang.org/x/crypto:
- SHA256: crypto/sha256
- SHA512: crypto/sha512
- MD5: crypto/md5 (cryptographically broken; avoid for security purposes)
- BLAKE2b: golang.org/x/crypto/blake2b
- BLAKE2s: golang.org/x/crypto/blake2s
Complete Example
Here's a more robust version with better error handling:
package main
import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"log"
	"os"
	"golang.org/x/crypto/blake2b"
)
func hashFile(filename string, useBlake2b bool) (string, error) {
	file, err := os.Open(filename)
	if err != nil {
		return "", fmt.Errorf("opening file: %w", err)
	}
	defer file.Close()
	// hash.Hash combines io.Writer with Sum, so no type assertion is needed
	// (the concrete digest types in both packages are unexported)
	var hasher hash.Hash
	if useBlake2b {
		hasher, err = blake2b.New512(nil)
		if err != nil {
			return "", fmt.Errorf("creating BLAKE2b hasher: %w", err)
		}
	} else {
		hasher = sha256.New()
	}
	if _, err := io.Copy(hasher, file); err != nil {
		return "", fmt.Errorf("reading file: %w", err)
	}
	return hex.EncodeToString(hasher.Sum(nil)), nil
}
func main() {
	if len(os.Args) < 2 {
		log.Fatal("Usage: hasher <file> [expected_hash]")
	}
	filename := os.Args[1]
	// Hash the file (using SHA256 by default)
	hash, err := hashFile(filename, false)
	if err != nil {
		log.Fatalf("Error hashing file: %v", err)
	}
	// Verify if expected hash provided
	if len(os.Args) >= 3 {
		expected := os.Args[2]
		if hash == expected {
			fmt.Println("✅ Hash verification: PASSED")
		} else {
			fmt.Println("❌ Hash verification: FAILED")
			fmt.Printf("Expected: %s\n", expected)
			fmt.Printf("Got: %s\n", hash)
			os.Exit(1)
		}
	} else {
		fmt.Println(hash)
	}
}
Summary
We've successfully created a file hashing utility that can:
✅ Generate hashes using different algorithms
✅ Verify file integrity against known checksums
✅ Work with large files efficiently using streaming I/O
✅ Support multiple algorithms (SHA256, BLAKE2b)
The hash algorithms in Go's standard library are well-designed and conform to the io.Writer interface, making them easy to use with Go's excellent I/O support.
What's Next?
In the next lesson, we're going to look at how we can hash passwords and accept them securely from the user, diving deeper into secure input handling.
📝 Homework Assignment
Extend the hasher application with these features:
Part 1: Algorithm Selection Flag
Add a flag to choose between different hashing algorithms:
./hasher -algorithm sha256 file.txt
./hasher -algorithm blake2b file.txt
./hasher -algorithm md5 file.txt # if you want to add MD5
Implementation hints:
- Use the flag package to add an -algorithm flag
- Create a switch statement to choose the appropriate hasher
- Return an error for unsupported algorithms
Part 2: Verification Flag
Instead of using positional arguments, add a -verify or -checksum flag:
# Generate hash
./hasher file.txt
# Output: abc123def456...
# Verify hash
./hasher -checksum abc123def456... file.txt
# Output: ✅ Hash verification: PASSED
Implementation hints:
- Add a string flag for the expected checksum
- Only perform verification when the flag is provided
- Make the interface more user-friendly
Part 3: Support Multiple Algorithms
Add support for additional algorithms from the crypto package:
- crypto/sha1
- crypto/sha512
- crypto/md5
Example Final Usage
# Generate SHA256 hash
./hasher -algorithm sha256 file.txt
# Verify BLAKE2b hash
./hasher -algorithm blake2b -checksum expected_hash file.txt
# List supported algorithms
./hasher -help
Once you've implemented these features, you'll have a robust file hashing utility that rivals system tools like sha256sum and md5sum!
Algorithm Selection Flag
Add a flag to choose between different hashing algorithms instead of hardcoding SHA256 as the default algorithm.
Requirements:
- Use the flag package to add an -algorithm flag
- Support at least sha256 and blake2b algorithms
- Create a switch statement to choose the appropriate hasher
- Return an error for unsupported algorithms
- Set SHA256 as the default if no algorithm is specified
Example Usage:
./hasher -algorithm sha256 file.txt
./hasher -algorithm blake2b file.txt
./hasher -algorithm md5 file.txt # if you want to add MD5
Implementation Notes:
- Import the required crypto packages for each algorithm
- Handle the case where BLAKE2b returns an error during creation
- Provide helpful error messages for unsupported algorithms
Verification Flag
Replace positional arguments with a more user-friendly flag-based interface for hash verification.
Requirements:
- Add a -checksum or -verify flag that accepts the expected hash
- Only perform verification when the flag is provided
- Maintain backward compatibility with hash generation when no checksum is provided
- Improve user interface with clear success/failure messages
Example Usage:
# Generate hash
./hasher file.txt
# Output: abc123def456...
# Verify hash
./hasher -checksum abc123def456... file.txt
# Output: ✅ Hash verification: PASSED
# Or using -verify flag
./hasher -verify abc123def456... file.txt
# Output: ✅ Hash verification: PASSED
Implementation Notes:
- Use flag.String() to define the checksum flag
- Check if the flag value is empty to determine verification mode
- Ensure the program exits with appropriate error codes (0 for success, 1 for failure)
Support Multiple Algorithms
Extend the application to support additional hash algorithms from Go's crypto package.
Required Algorithms:
- crypto/sha1: SHA-1 (legacy, but still used)
- crypto/sha512: SHA-512 (stronger than SHA-256)
- crypto/md5: MD5 (deprecated but requested)
Example Usage:
./hasher -algorithm sha1 file.txt
./hasher -algorithm sha512 file.txt
./hasher -algorithm md5 file.txt
Implementation Notes:
- Import the additional crypto packages
- Extend your switch statement to handle all algorithms
- Consider adding a warning for deprecated algorithms like MD5
- Ensure consistent hex encoding for all algorithms
Enhanced User Experience (Bonus)
Add additional features to make the tool more professional and user-friendly.
Optional Enhancements:
- Help flag: Add -help flag that lists all supported algorithms
- Version flag: Add -version flag
- Multiple files: Support hashing multiple files at once
- Output format: Add flags for different output formats (plain, JSON, etc.)
- Progress indication: Show progress for large files
Example Usage:
# Show help
./hasher -help
# Output: Usage: hasher [flags] file...
# Supported algorithms: sha1, sha256, sha512, blake2b, md5
# Hash multiple files
./hasher -algorithm sha256 file1.txt file2.txt file3.txt
# JSON output
./hasher -algorithm sha256 -format json file.txt
# Output: {"algorithm": "sha256", "file": "file.txt", "hash": "abc123..."}
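For the JSON output format, one approach is to describe the result as a struct and let encoding/json handle quoting and escaping. The sketch below is illustrative only: the result type and formatJSON function are hypothetical names, and the field names are taken from the example output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result is a hypothetical type mirroring the JSON fields shown above.
type result struct {
	Algorithm string `json:"algorithm"`
	File      string `json:"file"`
	Hash      string `json:"hash"`
}

// formatJSON renders r as a single-line JSON object.
func formatJSON(r result) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", fmt.Errorf("encoding result: %w", err)
	}
	return string(b), nil
}

func main() {
	out, err := formatJSON(result{Algorithm: "sha256", File: "file.txt", Hash: "abc123"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // {"algorithm":"sha256","file":"file.txt","hash":"abc123"}
}
```

json.Marshal emits compact output without spaces after the colons; json.MarshalIndent is available if a more readable layout is preferred.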
Final Expected Interface
After completing all parts, your hasher tool should work like this:
# Basic hash generation
./hasher file.txt
# Output: abc123def456... (SHA256 by default)
# Choose algorithm
./hasher -algorithm blake2b file.txt
# Output: def456abc123... (BLAKE2b hash)
# Verify hash
./hasher -algorithm sha256 -checksum abc123def456... file.txt
# Output: ✅ Hash verification: PASSED
# Show help
./hasher -help
# Output: Usage information and supported algorithms
# Error handling
./hasher -algorithm invalid file.txt
# Output: Error: unsupported algorithm 'invalid'
# Supported algorithms: sha1, sha256, sha512, blake2b, md5
Implementation Tips
- Error Handling: Use proper error handling and meaningful error messages
- Code Organization: Consider organizing hash creation into a factory function
- Type Assertions: Prefer the hash.Hash interface over type assertions when working with different hashers, since the concrete digest types in packages like crypto/sha256 are unexported
- Testing: Test with both small and large files to ensure your implementation works correctly
- Documentation: Add comments explaining the purpose of each flag and function
This homework will give you a robust file hashing utility that rivals system tools like sha256sum and md5sum!