Let’s start by describing what an Application Layer Load Balancer is, which is important to understand what it does and where the complexities of implementing one lie.
Usually a Load Balancer sits in front of a group of servers and routes client requests across all of the servers that are capable of fulfilling those requests.
Load Balancers ensure that the traffic is equally distributed between our healthy servers, minimising the response time.
There are different types of load balancers, and they can work at different levels of the OSI model. In this case, I’m going to focus on layer seven of the stack, routing HTTP requests from clients to a pool of HTTP servers.
flowchart TD
subgraph front
Client <--> LoadBalancer
end
subgraph back
LoadBalancer <-.-> Server1 & Server2 & Server3
end
Just a friendly reminder that the process I took could have been avoided, improved, and, of course, may be plain wrong.
I’m just telling my story and my Golang-improving journey through this post.
Feel free to give me feedback about it and, if it makes you learn something new or reflect on a topic you never thought about, just let me know ☀️
First of all, I needed to create a simple server. Golang is very powerful and allows you to do so in a few steps.
In the main, for example, we can have:
http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
fmt.Printf("Received request from %s\n", request.RemoteAddr)
fmt.Printf("%s / %s\n", request.Method, request.Proto)
fmt.Printf("%s / %s\n", request.Method, request.Proto)
fmt.Printf("Host: %s\n", request.Host)
fmt.Printf("User-Agent: %s\n", request.Header.Get("User-Agent"))
fmt.Printf("Accept: %+v\n\n", request.Header.Get("Accept"))
fmt.Printf("Replied with a hello message\n")
fmt.Fprintf(writer, "Hello From Backend Server")
})
err := http.ListenAndServe(":80", nil)
if err != nil {
log.Fatal("Error listening and serve")
}
And once run, we will have our small server listening on port 80, logging each request received at the / endpoint.
To verify it you can just call curl http://localhost/ --output - and get as a result: Hello From Backend Server.
Of course, this is not yet a service that forwards our requests to the specified servers, but it’s a start to understand how Golang works.
So to get back to our problem, I started with a unit test:
t.Run("should call the client to forward the request", func(t *testing.T) {
req, _ := http.NewRequest(http.MethodGet, "/", nil)
resp := httptest.NewRecorder()
mockClient := newMockClient()
spartimillu := NewSpartimilluServer(mockClient)
mockClient.On("ForwardRequest", mock.Anything).Return(&http.Response{
Status: "200 OK",
StatusCode: 200,
Proto: "HTTP/1.0",
Body: io.NopCloser(bytes.NewBufferString("dummy body")),
Request: req,
})
spartimillu.ServeHTTP(resp, req)
mockClient.AssertExpectations(t)
assert.Equal(t, "dummy body", resp.Body.String(), "got %q, want %q", resp.Body.String(), "dummy body")
})
With this test:
- I created a request and a response recorder using the httptest package.
- I created a MockClient to mock the client that will forward the requests:
type MockClient struct {
mock.Mock
}
func newMockClient() *MockClient { return &MockClient{} }
func (m *MockClient) ForwardRequest(req http.Request) *http.Response {
args := m.Called(req)
return args.Get(0).(*http.Response)
}
func (m *MockClient) HealthCheck() {
m.Called()
}
- I created a new SpartimilluServer:
type SpartimilluServer struct {
client client.Client
}
func NewSpartimilluServer(client client.Client) *SpartimilluServer {
return &SpartimilluServer{client: client}
}
- I mocked the ForwardRequest function to return a custom response with dummy body as a body.
- I asserted the expectations calling ServeHTTP.
The implementation was quite straightforward:
func (s *SpartimilluServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
fmt.Printf("Received request from %s\n", r.RemoteAddr)
fmt.Printf("%s / %s\n", r.Method, r.Proto)
fmt.Printf("%s / %s\n", r.Method, r.Proto)
fmt.Printf("Host: %s\n", r.Host)
fmt.Printf("User-Agent: %s\n", r.Header.Get("User-Agent"))
fmt.Printf("Accept: %+v\n", r.Header.Get("Accept"))
resp := s.client.ForwardRequest(*r)
fmt.Printf("Response from server: %s %s\n\n", resp.Proto, resp.Status)
body, err := io.ReadAll(resp.Body)
if err != nil {
http.Error(w, "Error reading the response body", http.StatusInternalServerError)
return
}
stringBody := string(body)
fmt.Fprint(w, stringBody)
fmt.Println(stringBody)
}
Now we need to implement our client and its functionalities.
Let’s start with an integration test!
t.Run("should forward a GET request to a specific server", func(t *testing.T) {
server, address := startTestServer(t, "ok")
defer server.Close()
conf := NewSpartimilluClientConf([]string{address})
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
resp := client.ForwardRequest(*req)
body := getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "ok", body, "got %v, wanted %v", body, "ok")
})
So let’s see what is happening here:
- I spawn a test server using the httptest package and defer its shutdown with the Close method:
func startTestServer(t *testing.T, bodyResponse string) (*httptest.Server, string) {
t.Helper()
server := httptest.NewServer(http.HandlerFunc(func(writer http.ResponseWriter, request *http.Request) {
fmt.Printf("%s has been called\n", bodyResponse)
fmt.Fprint(writer, bodyResponse)
}))
return server, server.URL
}
As you can see we are using the NewServer method to spawn up a new server and set a handler function that returns some info.
- I create a SpartimilluClientConf struct:
type SpartimilluClientConf struct {
addresses []string
}
func NewSpartimilluClientConf(addresses []string) SpartimilluClientConf {
return SpartimilluClientConf{addresses: addresses}
}
As you can see it just contains the server addresses (we will add the health-check endpoint later).
- I call the ForwardRequest method passing the request I created, reading the body with a small helper:
func getBodyFromResp(t *testing.T, resp *http.Response) string {
t.Helper()
bodyBytes, err := io.ReadAll(resp.Body)
assert.Nil(t, err)
return string(bodyBytes)
}
To then jump into the implementation:
type Client interface {
ForwardRequest(req http.Request) *http.Response
}
type SpartimilluClient struct {
conf SpartimilluClientConf
}
func NewSpartimilluClient(conf SpartimilluClientConf) *SpartimilluClient {
return &SpartimilluClient{conf: conf}
}
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(s.conf.addresses[0] + req.RequestURI)
}
return nil
}
func sendGetRequestToAnotherServer(url string) *http.Response {
resp, err := http.Get(url)
if err != nil {
log.Fatal("Can't perform the GET request towards the server")
}
return resp
}
As you can see I just called the http.Get(url) method, emulating the GET request we received.
To see everything in action we can just call our main:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{"http://localhost:8080"}))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
In this main:
- I set the address of the backend server listening on port 8080 and provided it into the configuration.
- I created the SpartimilluServer function handler and used it for my ListenAndServe function.
To spawn up my little server you can use a different main with the code you saw before, or just create a directory called, for example, server8080 containing an index.html file with this content:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
and run in your terminal: python -m http.server 8080 --directory server8080. It will spawn up a Python server serving the content of the server8080 directory.
Then you can just call your young load balancer: curl http://localhost/ --output -
The result will be:
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4. index.html"--> CURL
end
subgraph back
SpartimilluServer-. "2. GET index.html" .-> PythonServer8080
PythonServer8080-. "3. index.html" .-> SpartimilluServer
end
Now that we have our “forwarder” in place, we have to distribute the incoming requests using a scheduling algorithm called “Round Robin”.
It’s quite simple: we just need to distribute the traffic to each server in the list, one after the other, and once we have forwarded to all of them we start back at the beginning of the list.
For example:
Server | Request |
---|---|
A | 1, 4 |
B | 2, 5 |
C | 3, 6 |
Let’s start with another integration test:
t.Run("should forward a GET request to any server using a round robin algorithm", func(t *testing.T) {
server1, address1 := startTestServer(t, "server1")
defer server1.Close()
server2, address2 := startTestServer(t, "server2")
defer server2.Close()
server3, address3 := startTestServer(t, "server3")
defer server3.Close()
conf := NewSpartimilluClientConf([]string{address1, address2, address3})
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
resp := client.ForwardRequest(*req)
body := getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server1", body, "got %v, wanted %v", body, "server1")
resp = client.ForwardRequest(*req)
body = getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server2", body, "got %v, wanted %v", body, "server2")
resp = client.ForwardRequest(*req)
body = getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server3", body, "got %v, wanted %v", body, "server3")
})
We spawn up 3 servers and set their addresses in our configuration. Calling ForwardRequest three times, we expect the first request to reach server1, the second server2, and the third server3.
Let’s jump into the implementation:
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
}
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
var resp *http.Response
serverIndex := s.counter % len(s.conf.addresses)
switch req.Method {
case http.MethodGet:
resp = sendGetRequestToAnotherServer(s.conf.addresses[serverIndex] + req.RequestURI)
}
s.counter++
return resp
}
Using a counter and the modulo operator I implemented a simple round-robin algorithm.
Let’s see how it works:
counter | operation | serverIndex |
---|---|---|
0 | 0 % 3 | 0 |
1 | 1 % 3 | 1 |
2 | 2 % 3 | 2 |
3 | 3 % 3 | 0 |
4 | 4 % 3 | 1 |
5 | 5 % 3 | 2 |
6 | 6 % 3 | 0 |
… | … | … |
And so on. 🤯
Our main should look like this:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
As you can see we specified 2 addresses in our configuration, a server listening at 8080
and one listening at 8081
.
Of course, before starting our load balancer we should spawn up our servers.
Let’s create a directory (as we did before with server8080) but this time called server8081, with an index.html inside containing something similar:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8081.
</body>
</html>
Then we can just run these commands in 2 different shells:
python -m http.server 8080 --directory server8080
python -m http.server 8081 --directory server8081
Once both servers are up we can test it out just executing our main and calling our load balancer three times to see how it works:
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
and
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8081.
</body>
</html>
and
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4/6. index.html"--> CURL
end
subgraph back
SpartimilluServer-. "2. GET index.html" .-> PythonServer8080
PythonServer8080-. "3. index.html" .-> SpartimilluServer
SpartimilluServer-. "4. GET index.html" .-> PythonServer8081
PythonServer8081-. "5. index.html" .-> SpartimilluServer
end
Now that we implemented the main functionality we have to implement the health check that helps us to always forward the request to a live server.
Let’s start with a unit test:
t.Run("should call the client to do an health check", func(t *testing.T) {
req, _ := http.NewRequest(http.MethodGet, "/healthcheck", nil)
resp := httptest.NewRecorder()
mockClient := newMockClient()
spartimillu := NewSpartimilluServer(mockClient)
mockClient.On("HealthCheck").Return(&http.Response{
Status: "200 OK",
StatusCode: 200,
Proto: "HTTP/1.0",
Request: req,
})
spartimillu.HealthCheck()
mockClient.AssertExpectations(t)
assert.Equal(t, http.StatusOK, resp.Code, "got %q, want %q", resp.Code, http.StatusOK)
})
type MockClient struct {
mock.Mock
}
func newMockClient() *MockClient { return &MockClient{} }
func (m *MockClient) ForwardRequest(req http.Request) *http.Response {
args := m.Called(req)
return args.Get(0).(*http.Response)
}
func (m *MockClient) HealthCheck() {
m.Called()
}
As we have done before we are checking that the method HealthCheck
has been implemented correctly in our SpartimilluServer
.
With a very simple implementation:
func (s *SpartimilluServer) HealthCheck() {
fmt.Printf("Performing Health Check\n")
s.client.HealthCheck()
}
This method would be called every N seconds to check if our servers are still alive.
As usual, let’s continue with an integration test for our client:
t.Run("should perform a health check towards a server", func(t *testing.T) {
server1, address1 := startTestServer(t, "server1")
defer server1.Close()
server2, address2 := startTestServer(t, "server2")
defer server2.Close()
server3, address3 := startTestServer(t, "server3")
defer server3.Close()
conf := NewSpartimilluClientConf([]string{address1, address2, address3}, "/healthcheck")
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
client.HealthCheck()
resp := client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server1.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server1.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server2.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server2.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
server1.Close()
client.HealthCheck()
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server2.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server2.URL)
server2.Close()
client.HealthCheck()
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
})
Let’s see what we did:
- We call the HealthCheck method, which is supposed to update our list of available servers.
- We call the ForwardRequest method three times, checking that we contact the right server each time.
- We shut down server1 and call the ForwardRequest method again; this time it should call server2, since server1 has been shut down.
- We shut down server2 as well, expecting the client to call the only one left, server3.
Here is the implementation:
type Client interface {
ForwardRequest(req http.Request) *http.Response
HealthCheck()
}
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
healthyServers map[string]bool
}
We added to our struct a map of healthyServers that will be updated by our HealthCheck
method.
func NewSpartimilluClient(conf SpartimilluClientConf) *SpartimilluClient {
return &SpartimilluClient{conf: conf, healthyServers: make(map[string]bool)}
}
func (s *SpartimilluClient) HealthCheck() {
for _, address := range s.conf.addresses {
resp, err := http.Get(address)
if err == nil && resp.StatusCode == http.StatusOK {
s.healthyServers[address] = true
} else {
s.healthyServers[address] = false
}
}
}
We iterate over the list of addresses and for each server we do a GET request to check if we get an OK; if so we update our healthyServers map, using the address as the key and a boolean as the value.
And this is the implementation of our ForwardRequest
method:
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
if len(s.healthyServers) == 0 {
s.HealthCheck()
}
index := s.counter % len(s.conf.addresses)
address := s.conf.addresses[index]
s.counter++
if s.healthyServers[address] {
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(address + req.RequestURI)
}
}
return s.ForwardRequest(req)
}
Here we make sure that we perform a health check at least once before we start forwarding requests around.
Then, if the server we want to contact is healthy, we contact it; otherwise we make a recursive call to try the next one in the list.
The recursion is not very efficient but it works nicely for now, we will improve it in the next step.
Our main now looks like this:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}, "/healthcheck"))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
ticker := time.NewTicker(5 * time.Second)
go func() {
for {
select {
case <-ticker.C:
spartimilluServer.HealthCheck()
}
}
}()
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
This is the first raw version of our async HealthCheck.
We call asynchronously our HealthCheck
every 5 seconds using a ticker
.
To try it out we can reproduce the steps above: spawn our 2 stubbed servers, run our load balancer, then kill one of the two and check that, after at most 5 seconds, the load balancer contacts only the server that is still alive.
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4. index.html"--> CURL
end
subgraph back
PythonServer8080
SpartimilluServer-. "2. GET index.html" .-> PythonServer8081
PythonServer8081-. "3. index.html" .-> SpartimilluServer
end
Another thing we have to make sure to handle is concurrency.
As you may have noticed, our HealthCheck function modifies a map that is shared with ForwardRequest, and this can cause concurrency issues since both functions can access it at the same time. We can protect it using a mutex.
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
healthyServers map[string]bool
mu sync.Mutex
}
Here is our HealthCheck
implementation:
func (s *SpartimilluClient) HealthCheck() {
for _, address := range s.conf.addresses {
resp, err := http.Get(address)
s.mu.Lock()
if err == nil && resp.StatusCode == http.StatusOK {
s.healthyServers[address] = true
} else {
s.healthyServers[address] = false
}
s.mu.Unlock()
}
}
Every time we want to access our healthyServers map we take the lock, to be sure that nobody else can touch it until we release it.
And the ForwardRequest
one:
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
for {
if len(s.healthyServers) == 0 {
s.HealthCheck()
}
s.mu.Lock()
index := s.counter % len(s.conf.addresses)
address := s.conf.addresses[index]
s.counter++
if s.healthyServers[address] {
s.mu.Unlock()
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(address + req.RequestURI)
}
}
s.mu.Unlock()
time.Sleep(100 * time.Millisecond)
}
}
Here I did a bit of refactoring and, of course, used the Lock to handle concurrency issues. I also replaced the recursion with an infinite for loop and a short sleep before retrying to contact our servers.
After a bit of refactoring our main looks like this:
func main() {
const seconds = 1 * time.Second
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}, "/healthcheck"))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
go doEvery(seconds, spartimilluServer.HealthCheck)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
func doEvery(d time.Duration, f func()) {
ticker := time.Tick(d)
for range ticker {
go f()
}
}
At this point you should have your load balancer switching from one server to another and performing health checks correctly.
It has been quite a fun challenge: I iteratively built up my application load balancer, starting from a small forwarder and then adding more complex logic, facing some nice challenges along the way, like evolving my code to embrace change and doing integration tests by spawning up stub servers.
Of course, this is a very basic load balancer; it can be improved and extended, but I’m satisfied with it for now.
You can find the repository with the code in my Github profile, https://github.com/dlion/spartimillu.
What do you think about my solution? Any feedback would be appreciated and of course, if you make your solution don’t be shy and share it with me too!
Happy Coding!
Golang has always been one of my favorite languages and I’ve been using it for a few months so far; after my Take 3 experience I decided to keep studying it during my free time, and since I like hands-on projects, I used it to create a side-pet-challenge/project.
I recently discovered Coding Challenges, a website full of hands-on coding challenges that you can take in different languages; I chose mine: Go.
Sometimes I struggle to find new ideas and this website helped me a lot with that.
To start with something simple I decided to implement WC, the famous Word Count unix tool.
To learn more about it, run man wc in your terminal; essentially, it counts words, lines, characters, and bytes of a specific file or piped stream.
From this very high-level point of view, it seems quite simple but digging deeper you will see that it’s not as simple as you could have thought at the beginning.
Just a friendly reminder that the process I took could have been avoided, improved, and, of course, may be plain wrong.
I’m just telling my story through this project and my Golang improving journey.
Feel free to give me feedback about it and, if it makes you learn something new or reflect on a topic you never thought about, just let me know ☀️
Starting from scratch in Go is quite simple, so I just created my repo, opened my IntelliJ GoLand IDE, and created a simple hello world, ready to jump into my first requirement implementation.
The first requirement is to have a small functionality, just counting the number of bytes from a specific file.
Using the file that has been provided the result of this command should be:
>./gowc -c test.txt
342190 test.txt
Let’s see what we got from that:
Input:
- the -c parameter, which is the way we activate the count-bytes functionality
- test.txt, which is the file we want to count from
Output:
- the number of bytes followed by the filename
Through my repo’s commits you can see the history of my changes; I started with something completely different (like mocking a filesystem using testify/mock) and ended up with a bunch of simple unit tests:
func TestWcBytesReader(t *testing.T) {
t.Run("Count reads 0 bytes", func(t *testing.T) {
dummyContent := make([]byte, 0)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(0)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
t.Run("Count reads 1 byte", func(t *testing.T) {
dummyContent := make([]byte, 1)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(1)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
t.Run("Count reads multiple bytes", func(t *testing.T) {
dummyContent := make([]byte, 100)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(100)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
}
The implementation as you can imagine wasn’t a big deal for this feature:
func (w WcBytesReader) Count(content []byte) int64 {
return int64(len(content))
}
We will skip for now how I used it in the main; if you want to try out the parameter part you can just use the flag package, parse the arguments, and call the Count function from there once NewWcBytesReader has been called.
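For reference, here is a rough sketch of what that wiring could look like; the flag handling and file reading are my own assumptions, only NewWcBytesReader and Count come from the post (and they are assumed to be importable or in the same package):
package main

import (
	"flag"
	"fmt"
	"log"
	"os"
)

func main() {
	countBytes := flag.Bool("c", false, "Count bytes of the file")
	flag.Parse()

	// The remaining argument is assumed to be the file to read.
	filename := flag.Arg(0)
	content, err := os.ReadFile(filename)
	if err != nil {
		log.Fatalf("Error reading the file: %v", err)
	}

	if *countBytes {
		// NewWcBytesReader is the reader shown above.
		fmt.Printf("%d %s\n", NewWcBytesReader().Count(content), filename)
	}
}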
The second requirement was to support the command line option -l
that outputs the number of lines in a file.
The CLI input/output should be:
>./gowc -l test.txt
7145 test.txt
Let’s see what we got from that:
Input:
- the -l parameter, which is the way we activate the count-lines functionality
- test.txt, which is the file we want to count from
Output:
- the number of lines followed by the filename
Here are some unit tests I wrote:
t.Run("Count returns 0 lines with an empty file", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 1 lines with just one line", func(t *testing.T) {
dummyFile := []byte("Dummy String")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 3 lines with a multi lines file content", func(t *testing.T) {
dummyFile := []byte("Line 1\nLine 2\nLine 3")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 3 lines with a multi lines file content with a trailing empty line", func(t *testing.T) {
dummyFile := []byte("Line 1\nLine 2\nLine 3\n")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
And here is the implementation:
func (w WcLinesReader) Count(content []byte) int64 {
if len(content) == 0 {
return int64(0)
}
lines := strings.Split(string(content), "\n")
if lines[len(lines)-1] == "" {
return int64(len(lines) - 1)
}
return int64(len(lines))
}
The third requirement was to support the command line option -w
that outputs the number of words in a file.
The CLI input/output should be:
>./gowc -w test.txt
58164 test.txt
Let’s see what we got from that:
Input:
- the -w parameter, which is the way we activate the count-words functionality
- test.txt, which is the file we want to count from
Output:
- the number of words followed by the filename
The unit tests I wrote:
t.Run("Count returns 0 if the file doesn't have words", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
t.Run("Count returns 1 if the file have just 1 word", func(t *testing.T) {
dummyFile := []byte("Dummy")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
t.Run("Count returns 3 if the file have 3 words", func(t *testing.T) {
dummyFile := []byte("Dummy Word Here")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
As you can see I always maintained the same style, starting with a simpler scenario and moving up to more complex ones.
The implementation:
func (w WcWordsReader) Count(content []byte) int64 {
words := strings.Fields(string(content))
return int64(len(words))
}
Here, instead of spending lots of time understanding what types of words I want to support, how Golang interprets and counts them, corner cases, Unicode characters, etc., I decided to use the strings.Fields method. According to the doc:
Fields splits the string s around each instance of one or more consecutive white space characters, as defined by unicode.IsSpace, returning a slice of substrings of s or an empty slice if s contains only white space. – Golang Doc
My goal wasn’t to reinvent the wheel, and in some contexts/domains (e.g. security) you shouldn’t either.
The fourth requirement was to support the command line option -m
that outputs the number of characters in a file.
The CLI input/output should be:
>./gowc -m test.txt
339292 test.txt
Let’s see what we got from that:
Input:
- the -m parameter, which is the way we activate the count-chars functionality
- test.txt, which is the file we want to count from
Output:
- the number of characters followed by the filename
The unit tests I wrote:
func TestWcCharsReader(t *testing.T) {
t.Run("Count reads 0 chars", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads 1 char", func(t *testing.T) {
dummyFile := []byte("a")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads multiple chars", func(t *testing.T) {
dummyFile := []byte("abc")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads multiple chars included unicode ones", func(t *testing.T) {
dummyFile := []byte("🚀")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
}
I still used the same format as before; the last test is different since it allows us to test Unicode characters (in this case an emoji). As you might know, Unicode characters are counted differently; if you want to know more about it, read this article: https://tonsky.me/blog/unicode/
And the following implementation:
func (w WcCharsReader) Count(content []byte) int64 {
return int64(utf8.RuneCount(content))
}
The utf8.RuneCount
method allows me to count the number of runes in a string considering utf-8s also.
RuneCount returns the number of runes in p. Erroneous and short encodings are treated as single runes of width 1 byte. – Golang Doc
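To see why this matters, here is a quick example of my own (not from the original post): the rocket emoji is a single character but takes 4 bytes in UTF-8, so counting bytes and counting runes give different results.
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	content := []byte("🚀")
	fmt.Println(len(content))            // 4: number of bytes
	fmt.Println(utf8.RuneCount(content)) // 1: number of runes (characters)
}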
In this step, we should support the default option: when no options have been provided, we activate the -c, -l, and -w options.
The CLI input/output should be then:
>./gowc test.txt
7145 58164 342190 test.txt
Since we have already implemented all functionalities we just need to rearrange the way we activate them.
At first look, it seems that having something like:
if *flagBytes == true { ... }
else if *flagLines == true { ... }
else if *flagWords == true { ... }
else { activeDefaultOptions() }
Might work fine, but I wanted to improve it a bit; I didn’t like the idea that adding new functionality would require touching/duplicating lots of code.
In the beginning, I got confused and thought that each option needed to have a filename attached, something like -c filename.txt -w filename.txt, but then I realized this would lead to a very complex solution, since flag doesn’t support “empty” flags without passing some default values, which in the case of strings would have been awkward.
So I reverted my design choice to a simpler one, using boolean flags instead.
I don’t like having everything in the main, so I created a parameters
dir and wrote some unit tests:
func TestParameters(t *testing.T) {
t.Run("Parameters have been provided", func(t *testing.T) {
os.Args = []string{"wc", "-l", "text.txt"}
actual := HasProvided()
assert.Truef(t, actual, "expected %t, got %t", true, actual)
})
t.Run("Parameters haven't been provided", func(t *testing.T) {
os.Args = []string{"wc"}
actual := HasProvided()
assert.Falsef(t, actual, "expected %t, got %t", false, actual)
})
t.Run("Get filename from parameter provided", func(t *testing.T) {
os.Args = []string{"wc", "-l", "text.txt"}
actual := GetFilename()
expected := "text.txt"
assert.Equal(t, expected, actual, "expected %t, got %t", expected, actual)
})
t.Run("Get true if at least one flag has been passed", func(t *testing.T) {
getBooleanPointer := func(b bool) *bool { return &b }
flags := map[string]*bool{
"c": getBooleanPointer(false),
"d": getBooleanPointer(true),
"e": getBooleanPointer(false),
}
actualName, actualBool := HaveBeenPassed(flags)
expectedName := "d"
assert.Equal(t, expectedName, actualName, "expected %t, got %t", expectedName, actualName)
assert.Truef(t, actualBool, "expected %t, got %t", true, actualBool)
})
t.Run("Get false if no flags have been passed", func(t *testing.T) {
getBooleanPointer := func(b bool) *bool { return &b }
flags := map[string]*bool{
"c": getBooleanPointer(false),
"d": getBooleanPointer(false),
"e": getBooleanPointer(false),
}
actualName, actualBool := HaveBeenPassed(flags)
expectedName := ""
assert.Equal(t, expectedName, actualName, "expected %t, got %t", expectedName, actualName)
assert.Falsef(t, actualBool, "expected %t, got %t", true, actualBool)
})
}
The functions I implemented are:
func HasProvided() bool {
return len(os.Args) > 1
}
It tells me whether any parameters have been provided.
func GetFilename() string {
return os.Args[len(os.Args)-1]
}
It gives me the last parameter entry which should be the filename.
func HaveBeenPassed(flags map[string]*bool) (string, bool) {
for flagName, flagValue := range flags {
if *flagValue == true {
return flagName, true
}
}
return "", false
}
It gives me the first flag that has been activated.
To initialize and get the flags I also wrote a function:
func GetFlags() map[string]*bool {
flags := map[string]*bool{
BytesFlag: flag.Bool(BytesFlag, false, "Count bytes of the file"),
LinesFlag: flag.Bool(LinesFlag, false, "Count lines of the file"),
WordsFlag: flag.Bool(WordsFlag, false, "Count words of the file"),
CharsFlag: flag.Bool(CharsFlag, false, "Count chars of the file"),
}
flag.Parse()
return flags
}
It creates a map with strings as keys (specific const values) and boolean pointers as values, which flag sets according to the CLI parameters.
It refers to the const variables, which are the keys of the GetFlags
map and our parameters:
const (
BytesFlag = "c"
LinesFlag = "l"
WordsFlag = "w"
CharsFlag = "m"
)
So if tomorrow we need to add a new parameter we can just add a new Constant, add it to our map with the respective flag.Bool
call, and everything is encapsulated inside the parameters.go
file.
So going back to our main we have to get the input combining the functions above:
func getInput() ([]byte, string) {
if parameters.HasProvided() {
filename := parameters.GetFilename()
return readFile(filename), filename
}
return make([]byte, 0), EmptyString
}
I created a private function to verify if some parameters have been passed, and then I got the file name and read it, returning the content.
Then, after setting and getting the flags using the function parameters.GetFlags(), I initialized my readers using the function reader.InitializeReaders(), which instantiates all readers, storing them into a map of string to WcReaderManager:
func InitializeReaders() map[string]WcReaderManager {
return map[string]WcReaderManager{
parameters.BytesFlag: bytesReader.NewWcBytesReader(),
parameters.LinesFlag: linesReader.NewWcLinesReader(),
parameters.WordsFlag: wordsReader.NewWcWordsReader(),
parameters.CharsFlag: charsReader.NewWcCharsReader(),
}
}
Once initialized, I verify, calling the function parameters.HaveBeenPassed(flags), whether any parameter has been passed.
If a flag has been passed, I call reader.CountWithSpecificReader(initializedReaders[flagNamePassed], input), whose implementation is:
func CountWithSpecificReader(specificReader WcReaderManager, input []byte) int64 {
return specificReader.Count(input)
}
It gets a specific reader based on the flag that has been passed and the input, which is the content of the file, and calls its Count function, returning the output.
Otherwise, I call reader.CountBytesWordsAndLines(initializedReaders, input), which uses the initialized readers saved into the map to count the input for the 3 default options: bytes, words, and lines. The implementation is:
func CountBytesWordsAndLines(readers map[string]WcReaderManager, input []byte) (int64, int64, int64) {
return readers[parameters.BytesFlag].Count(input),
readers[parameters.WordsFlag].Count(input),
readers[parameters.LinesFlag].Count(input)
}
I was able to accomplish that thanks to the WcReaderManager
interface:
type WcReaderManager interface {
Count(content []byte) int64
}
If you are curious about the approach I used, have a look into the Strategy Pattern.
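The post doesn’t show the final main, but to make the flow concrete here is a rough sketch of how these pieces could be wired together; the function names come from the snippets above, while the output formatting is my assumption based on the expected CLI output:
func main() {
	flags := parameters.GetFlags()
	initializedReaders := reader.InitializeReaders()
	input, filename := getInput()

	// If a single flag has been passed, use the matching reader.
	if flagNamePassed, passed := parameters.HaveBeenPassed(flags); passed {
		fmt.Printf("%d %s\n", reader.CountWithSpecificReader(initializedReaders[flagNamePassed], input), filename)
		return
	}

	// Default behaviour: lines, words, and bytes, like wc with no options.
	bytesCount, wordsCount, linesCount := reader.CountBytesWordsAndLines(initializedReaders, input)
	fmt.Printf("%d %d %d %s\n", linesCount, wordsCount, bytesCount, filename)
}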
The final step is about supporting reading from standard input if no filename is specified.
The CLI input/output should be:
>cat test.txt | ./gowc -l
7145
To do that I created a directory called pipeline, containing useful functions to solve the problem. Here are the unit tests I wrote:
func TestPipeline(t *testing.T) {
t.Run("HasInput should return false if an input hasn't come from pipeline", func(t *testing.T) {
actual := HasInput()
assert.Falsef(t, actual, "expected %t, got %t", true, actual)
})
t.Run("HasInput should truw if an input comes from pipeline", func(t *testing.T) {
r, w, _ := os.Pipe()
_, _ = w.Write([]byte("Hello"))
_ = w.Close()
defer func(v *os.File) { os.Stdin = v }(os.Stdin) // capture the original stdin before replacing it
os.Stdin = r
actual := HasInput()
assert.Truef(t, actual, "expected %t, got %t", true, actual)
})
}
And here is the implementation:
func HasInput() bool {
f, _ := os.Stdin.Stat()
return (f.Mode() & os.ModeCharDevice) == 0
}
func ReadInput() []byte {
input, err := io.ReadAll(os.Stdin)
if err != nil {
log.Fatalf("Error reading the pipeline: %v", err)
}
return input
}
I check if the information that we get from the standard input is coming from a pipe operator, and then I just updated the getInput
function used before:
func getInput() ([]byte, string) {
const EmptyString = ""
if pipeline.HasInput() {
return pipeline.ReadInput(), EmptyString
}
if parameters.HasProvided() {
filename := parameters.GetFilename()
return readFile(filename), filename
}
return make([]byte, 0), EmptyString
}
func readFile(filename string) []byte {
input, err := os.ReadFile(filename)
if err != nil {
log.Fatalf("Error reading the file: %v", err)
}
return input
}
- gowc
  - parameters
  - pipeline
  - reader
    - bytes
    - chars
    - lines
    - words
  - testdata
- In the parameters directory I’ve put everything related to parameters, which means the definition of the parameters and some helpers.
- In the pipeline directory I’ve put everything about the way information is passed through the pipe operator and the standard input, like identifying when it happens and how to read from it.
- In the reader directory I have everything related to my readers; the main reader file contains the WcReaderManager interface, which has a Count function, and some helpers to initialize the readers.
- The testdata directory is just a place to store the sample file.
It required a bit of refactoring to get into this shape. Inside the reader directory there is the core of my application, beginning with the reader.go file: it contains the WcReaderManager interface and a bunch of functions which help me initialize and call my readers in specific ways.
Under each bytes
, chars
, lines
, and words
directory there is the actual implementation that will be executed when needed.
As you can see it helps me isolate things and makes it clear where each function belongs. Having calls like parameters.HasProvided(), pipeline.HasInput(), and reader.CountWithSpecificReader(...) really improves readability and understanding; I like this structure for this reason.
Of course the development process wasn’t this smooth; there were trials and errors here and there, as there are supposed to be.
Perfect is the enemy of good enough
I tried to face this coding challenge by reading one requirement after the other and doing a step-by-step evolution in my codebase.
It means that I’ve had to change and adapt my code to the new requirements.
Yeah, I faced this challenge as if it were a real scenario, and this is, I think, the best way to learn: on the job.
In the beginning, it seemed quite simple, I just needed to do some counting here and there, but then at every green test, I felt the urge to clean and refactor my codebase.
Friendly reminder, refactoring should be part of your definition of done.
With each iteration, my codebase evolved into something more clear and thanks to my testing strategy I could do it in no time.
Sure, I had to move from one design decision to another, but that’s normal in a codebase; consider that whenever you touch some code, sooner or later you will need to change it, and how coupled it is to other components is what will make the difference in the long run.
If you want to have a look at the code, you can find it on my Github Profile: https://github.com/dlion/gowc.
So what do you think about my solution?
Did I miss something? Can I improve it? Can it be more idiomatic?
Of course this solution can be definitely improved, and the exercise wasn’t about reading large files (given the example file).
In that case I would have implemented the reader in a different way in order to not have the entire file in memory and so on.
Implement your solution and let me know what you think about this challenge.
Happy Coding!
A few weeks ago, I was interviewed by PointerPodcast about my experience during my Take 3 and how I work in Tanzu Labs.
It was quite fun; even though at some points I felt there was much more to talk about, I just barely scratched the surface.
Yeah, sorry for English speakers, it’s in Italian 🇮🇹
You can find the episode on Spotify and Apple Podcast here:
I’d like to start with how I ended up working full-time on an open-source project, and to do so, I have to describe what a VMware Take3 initiative is.
VMware, the company I currently work for, gives everyone the fantastic opportunity to join another team within the company for a limited amount of time, usually 3 months (that’s why the 3 😉).
Every team that needs some help or that is open to accepting temporary new joiners can publish an opportunity in an internal platform. Sometimes they require specific skills, proficiency in a particular stack, or just people with the motivation to learn new things.
Everyone who has the approval can apply to those opportunities, and have a conversation with the manager who published it to verify whether that person might be a good fit or not.
As you can imagine, I went through this process early this July and applied for a Take3 to join a small open-source team called CNB - Cloud Native Buildpacks.
TLDR: Buildpacks turn your code into OCI-compliant containers. They examine your source code, build it, and create a container image with all the required dependencies to run your application. ⚡️
Using Dockerfiles can be exhausting: you have to make lots of decisions, like which base image and version to use, making sure that all your application dependencies are ok with it. After that, you need to bring in additional dependencies and runtimes, build your application, and finally optimize all these operations to get an optimized container.
Cloud Native Buildpacks on the other hand would take care of all these steps, at least for most of the common use cases.
Your container also needs to be maintained over time; with Dockerfiles you don’t have a real separation between the base image, the runtime, your dependencies, and your application, so updating the image requires rebuilding it every time.
Cloud Native Buildpacks create different layers that can be swapped with new versions like Legos. 🧱
If you want to know more about it, have a look at the Cloud Native Buildpacks website.
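To give a rough idea of what that looks like in practice (the image name below is just a placeholder), building an application with the pack CLI boils down to something like:
pack build my-app --builder <builder-of-your-choice>
instead of writing and maintaining a Dockerfile yourself.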
CNB stands for Cloud Native Buildpacks, it’s a small team composed of amazing people who are spending their daily time on repositories like Buildpacks/Pack and Buildpacks/Lifecycle.
Their main duty is to help the Buildpacks maintainers and the Buildpacks community, they take care of the issues, develop new features, attend numerous working group meetings, and help release new versions.
The Open Source world is HUGE, and a standard to follow doesn’t exist yet, so every experience can be unique. 👈🏻
The team’s favorite way of working is to be async as much as possible due to the different timezones involved and then take advantage of the overlapped hours to pair/sync if necessary.
We were using an async standup where we could raise blockers and/or keep the rest of the team up-to-date.
During the week we had different meetings:
Most of my mornings were free from meetings due to the different time zones, so I could focus on getting things done. 🚀
Getting things done meant continuing my tasks in progress, reviewing open issues, and reviewing PRs. Being in a different timezone also allowed me to interact efficiently with whoever was in my timezone.
The afternoon was when I had most of my meetings: I could catch up with the rest of the team and join the necessary meetings explained above, besides, of course, our pairing sessions.
The expectation that had been set for me was to deliver one feature by the end of the 3 months of the Take3.
My todo list was kind of similar:
Despite my long experience jumping from one stack to another, I’ve had just one occasion to push any written Golang code to prod so far, and it was a few years ago. Now, the challenge was to re-learn it and, to kind of push it to prod.
Kind of, because our “prod” was a “release” state, not a real prod env, since the software I was working on was mainly a CLI app.
Honestly, I liked Golang and I was able to use it quite effectively in no time. I liked to work with this language and I would like to work again with it in the future. 👀
Thanks to my years of experience jumping from one stack to another as a consultant Extreme Programmer, I could re-learn the basics of the Go language quite quickly and, I’ve got my first feature merged within one week.
I picked up one first good issue: https://github.com/buildpacks/pack/issues/1800 and created a PR that has been merged within 1 week: https://github.com/buildpacks/pack/pull/1810. 🥳
At that moment, I was officially a paid Open Source contributor. 🚀
I started contributing to the Open Source when I was young, I think I was 17 years old, this is my first contribution ever: https://github.com/toshidex/DefollowNotify/pull/1.
Then I’ve been a Hacktoberfest contributor: 2017, 2018, 2019, 2020.
And finally, after so many years, I was a paid open-source contributor. I was so happy that I had to share it on Linkedin too.
Months have passed now, and I’m back at Tanzu Labs, happy to have had the time to help this amazing team and learn about Buildpacks internals.
It was an amazing experience I’d try again in the future.
I led and facilitated a user journey mapping session with some of our users about a flatten feature. It was really interesting getting feedback from our users and shaping the functionality based on it, and of course I was happy to have put my facilitation skills at the service of the project.
I over-achieved the initial goal of delivering just one small feature, but I got passionate about the project and couldn’t limit myself to delivering just one small thing; I wanted to have an impact and deliver as much value as possible to our users.
I’d like to thank:
I’d like to start by saying that I’m not a big fan of collecting as many certifications as possible; I’m more affected by the greedy-learner pathology, which is what pushed me down this route.
This indeed is my first real certification.
I started to study AWS just for fun and out of a desire to become a better professional. I wasn’t really interested in getting any certification.
I started it for fun and it became even more fun over time, digging deeper into AWS services and use cases, so at the end of the path/course I felt that, after all the effort, I could challenge myself even more by getting the certification.
I LOVE GOING OUTSIDE MY COMFORT ZONE
Did I need it? No.
Do you need it? Probably not.
It was just a matter of challenging myself, nothing more 🤭 so my advice is to enjoy the journey and learn as much as possible but just for your own sake, it will give you the extra motivation you need to pass the exam at the first try. 🎯
I work at VMware Tanzu Labs right now, and we often jump from one project to another; it’s always fun and it gives you the possibility to work on different things and not get bored. 🤩
One thing that I noticed was that during my career I’ve always been facing AWS architectures at least once per year.
At least one client had their ecosystem on AWS.
Working for a Multi-Cloud company gives me the possibility to have a T-shaped skill set, focusing more on the how than the what; I mean, I worked on Azure and GCP as well, but to be honest the most fun ecosystems I worked on were on AWS. 🤭
So I felt that I needed, for my career and for my interest (📚) to fill out some gaps that I had on AWS to become a better engineer and a better professional. 💪🏻
Working at VMware also means I have lots of benefits 😌🙏 and one of them is the possibility to access https://acloudguru.com/ courses/resources/labs FOR FREE! ✨
A Cloud Guru is well known to be very expensive but to have one of the best playgrounds out there.
Yes, essentially you can just go to their playground section, open a new session of their sandbox, and log in.
From that moment, and for a number of hours, you will have the chance to play with an almost real AWS environment, bill-free. 👀
Of course, A Cloud Guru has its course: https://learn.acloud.guru/course/aws-certified-developer-associate
I took it and I can say that it covers more or less everything you need to know to pass the exam.
In particular, I want to call out the 4 mock exams, which have been super useful to get to know the exam environment and the questions’ style.
Another resource I found useful to reinforce my knowledge was https://tutorialsdojo.com/
It’s a website that contains a study path for each certification.
Here the one for the Developer Associate: https://tutorialsdojo.com/aws-certified-developer-associate-exam-guide-study-path-dva-c02/
In Tutorials Dojo you can also find mock exams which, from what I’ve heard, are very close to the real ones, but I haven’t purchased them so I don’t have a personal opinion or experience with them.
On either Tutorials Dojo or A Cloud Guru you can find a list of recommended AWS whitepapers that are worth reading.
I know, they are quite big but I think that reading them once is worth your time.
I also really liked reading about other experiences on /r/AWSCertifications/, it is full of nice advice and great people who can help you out.
There I discovered that there are other recommended resources that I didn’t follow, maybe it’s worth having a look at it. 👀
Every person has their unique approach so there aren’t good or wrong ways to study.
With a full-time job it’s always complicated to find the time and the energy to study; for this reason, I repeat that you should study only because you want to learn new things, improve yourself, and become a better professional.
The certification itself IMHO doesn’t add anything to your skill set.
I studied for 2 months more or less, dedicating myself to it almost every day for at least 25 minutes.
Something I would like to highlight here is that I have more than 10 years of experience as a Software Engineer so your experience can be different and you might need more time and resources.
Obsidian is my main tool for taking notes, creating blog posts, scheduling my day, keeping track of everything, and of course: studying.
I switched over to Obsidian from Notion and I will never get back to it.
My main way to study is to take notes about whatever I read/watch and then look at them later on, to memorise those concepts better and keep them fresh.
Spaced repetition works very well for me, even though I’m not very consistent.
Another amazing tool I’ve been using to memorise better is Excalidraw, an infinite canvas that I used to divide each topic with its information.
It has been very useful to visualise the information that I’ve got from the video course and blog posts.
Worth mentioning that Obsidian has an Excalidraw plugin, which means you can use all Excalidraw functionalities from your Obsidian instance, keeping your drawings locally.
For me having a visual representation works quite well and it helps to remember better.
As mentioned before, doing mock exams IS KEY to passing your exam; it helps you get familiar with the questions and the timing.
A Cloud Guru gives you 4 exams that you can practice with.
My strategy was:
My goal was to pass each mock exam with at least 80% of correct answers.
I’ve done it multiple times during the 2 months I spent studying.
Don’t focus too much on the specific questions but more on the topics they cover.
A Cloud Guru helps a lot with the hands-on part but it doesn’t mean you can’t open an AWS free-tier account and try by yourself to play with AWS services.
Right now the exam is composed of 65 multiple-choice questions, so at first it may seem not hands-on oriented, but that would be the wrong assumption.
Lots of questions are about specific technical details and particular service options that are easier to answer if you have had the chance to play with them, and having hands-on experience is always better considering the nature of this certification.
Coming to the end of the article, I’d like to say again that I think that the certification per se says nothing about your competencies and skill set.
Yeah, it shows other people that you can stick to a plan, go out of your comfort zone and that you are capable of learning new things.
I’ve been a software engineer for more than 10 years so far, and I can say that having a certification doesn’t mean you are a better professional than those who don’t have it.
Don’t be too hard on yourself if you don’t pass the exam, don’t complete the course, or drop it after a few months.
Keep trying, keep being motivated thinking about what you are learning more than what you can do with the certification.
Good Luck! 🍀
It’s been a fun and rewarding journey, enhancing my expertise in building cloud-native applications with AWS services! ☁️🔭
Probably the most well-known principle: I’m talking about the Least Privilege design principle.
design principle.
This principle revolves around the concept of giving just the required privileges to a specific user/application to operate correctly, which means with the fewest privileges possible.
Following this principle makes unintentional or improper uses of privilege less likely to occur.
A few points to remember 👇
This principle is often called the non-bypassability
principle. Essentially, it states that every access attempt coming from an external domain should be checked, and especially, don’t act on the data received before validating that the request came from a valid source.
By following this principle, we have thorough and consistent authorization checks at every access point in a software system, protecting data and enhancing the security of our application.
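As a small illustration of my own (not from the article), in a Go HTTP service complete mediation can take the shape of a wrapper that checks every single request before the handler acts on the data it carries:
func requireAuth(next http.Handler, validToken string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+validToken {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return // the request never reaches the handler without passing the check
		}
		next.ServeHTTP(w, r)
	})
}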
This is the simplicity
principle, also called KISS.
Security and over-engineering are always a dangerous duo. Having a security mechanism with lots of hidden features and intricate components can increase the chances of something going wrong.
The rule to follow this principle is to keep your security mechanism simple. Don’t try to reinvent the wheel or overcomplicate the solution – keep it simple.
A simple system is easier to review, maintain, and test, and harder to get wrong.
The Open Design principle is often underestimated, but it’s one of the most powerful.
Essentially, it states that an attacker shouldn’t be able to break into our system just because they know how it works. Relying on the ignorance of the attacker to protect our system is always a big mistake.
We should always act as if the security mechanism is publicly known and depend on the secrecy of a few easily changeable items like credentials.
The opposite of the Open Design principle is called Security Through Obscurity, and there have been multiple documented cases that prove it doesn’t work.
Moreover, having an open design makes extensive public scrutiny possible and gives confidence to any user who knows about the mechanism used that our software is secure.
The rule of this principle is that in situations where a decision or authorization cannot be explicitly determined, the system should default to the most secure option.
Don’t distribute software with an empty or default password; instead, force the user to set it up during the installation process.
How many times have you seen default passwords being used by unaware users? By applying this principle, developers can minimize the risks associated with incomplete or erroneous authorization decisions.
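As a tiny, purely illustrative sketch of the idea (the names here are mine, not from any real system): anything that isn’t an explicit allow should be treated as a deny.

// Fail-safe default: only an explicit "allow" grants access; anything missing or unclear is denied.
type Decision = "allow" | "deny";

function isAccessGranted(decision?: Decision): boolean {
  return decision === "allow";
}

console.log(isAccessGranted("allow"));   // true
console.log(isAccessGranted(undefined)); // false: the secure option wins by default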
This principle states that access to critical resources or sensitive operations should depend on more than one independent condition. In this way, even if an attacker manages to break one condition, they still need to break the others to compromise the system’s security.
It promotes the distribution of privileges and responsibilities to multiple independent entities, reducing the potential impact of compromised accounts.
This principle focuses on reducing the amount of shared resources or dependencies between different components of a system. In some cases, sharing can reduce costs, but it increases security risks.
Following this principle leads to having more modular and robust software. For example, we can establish separate database connections for different components or modules instead of using a single, shared one. This approach minimizes the chances of conflicts or bottlenecks and allows for better isolation and scalability.
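To make the database example a bit more concrete, here’s a rough sketch (the factory and module names are made up for illustration): each module owns its own connection instead of sharing a single global one.

// Hypothetical sketch: one connection per module instead of a single shared one.
interface Connection {
  query(sql: string): Promise<unknown>;
}

// A stand-in factory; in real code this would come from your database driver.
function createConnection(database: string): Connection {
  return { query: async (sql) => { console.log(`[${database}] ${sql}`); return []; } };
}

// Each module owns its own connection, so an issue in one doesn't ripple into the others.
const billingDb = createConnection("billing");
const reportingDb = createConnection("reporting");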
This principle is more user-centric. It states that the security mechanism’s user interface must be designed to be user-friendly and simple to use.
If something is hard to use, it is often insecure in practice because users will work around it to make their lives easier.
One easy example is password rules: if they are too convoluted, after a few attempts users will resort to the simplest password that passes all the checks just to move on, causing the opposite effect.
These security design principles are not mere theoretical concepts; they provide practical guidelines that can be applied during the software development lifecycle, which, for me, is GOLD.
Of course, these security design principles are guidelines, and there may be good reasons not to apply them in some cases. It is important to think about them wisely and consider the specific context of your software development project.
Remember, security is an ongoing process. We must remain vigilant, continuously assess our systems, and act accordingly.
As software engineers, we have the responsibility to build software that not only meets functional requirements but also prioritizes the protection of user privacy, data integrity, and system reliability.
The alias I created for my account is dlion@domenicoluciani.com, where dlion is my nickname and domenicoluciani.com is my custom domain.
If you try to search it on Mastodon, you will find my current account, which resides on mastodon.social.
I discovered that Mastodon uses ActivityPub to communicate between different actors and that those actors are found using WebFinger, a way to attach information to a specific email address or other online resources.
So I just needed to implement the WebFinger spec on my domain to have it working.
On your Mastodon instance, you have an endpoint called .well-known/webfinger, which accepts a query parameter that allows other Mastodon instances to get information about a particular account.
<your-mastodon-address>/.well-known/webfinger?resource=acct:<your-nick>@<your-mastodon-address>
For instance, in my case, doing a curl GET request to this URL:
https://mastodon.social/.well-known/webfinger?resource=acct:dlion@mastodon.social
I get the WebFinger response for my account:
{
"aliases" : [
"https://mastodon.social/@dlion",
"https://mastodon.social/users/dlion"
],
"links" : [
{
"href" : "https://mastodon.social/@dlion",
"rel" : "http://webfinger.net/rel/profile-page",
"type" : "text/html"
},
{
"href" : "https://mastodon.social/users/dlion",
"rel" : "self",
"type" : "application/activity+json"
},
{
"rel" : "http://ostatus.org/schema/1.0/subscribe",
"template" : "https://mastodon.social/authorize_interaction?uri={uri}"
}
],
"subject" : "acct:dlion@mastodon.social"
}
I just need to put this response on my server, under the same path and file name (.well-known/webfinger), and that’s it.
For my blog, I use GitHub for hosting; specifically, I use github-pages, which means using Jekyll, a static site generator.
To have your Mastodon alias on your custom domain using Jekyll, you need to:
1. Create a directory called .well-known.
2. Inside .well-known, create a new file called webfinger.
3. Inside the webfinger file, put the response you get when you curl your actual Mastodon instance’s WebFinger endpoint, as mentioned before.
4. In _config.yml, add include: ["/.well-known"] to include that directory in your rendering.
And it’s done. Just push and wait a few minutes. Your alias will be found as anything@your-custom-domain.whatever by Mastodon, redirecting everyone to your actual account.
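If you want to double-check that the alias resolves before sharing it around, here’s a minimal sketch of mine (assuming Node 18+ with the global fetch API; swap in your own domain and nickname):

// Query your custom domain's WebFinger endpoint and print the subject it returns.
const resource = "acct:dlion@domenicoluciani.com"; // replace with your alias
const url = `https://domenicoluciani.com/.well-known/webfinger?resource=${encodeURIComponent(resource)}`;

fetch(url)
  .then((res) => res.json())
  .then((body) => console.log(body.subject)) // should print the acct: of your real Mastodon account
  .catch((err) => console.error("WebFinger lookup failed", err));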
If you want to know more about WebFinger, you can have a look at the original website of the spec and at Mastodon’s documentation:
I found out about this method thanks to this article by Maarten Balliauw:
Do you use pair programming? How do you do it?
— Swizec Teller encouraging you to Be An Expert (@Swizec) February 22, 2021
It often feels like a more tiring way to move slower to me. Great for solving gnarly bugs together, but not for coding.
I read complaints that 8 hours of pair programming is a nightmare: it ruins people’s lives, drains all their mental energy, and leaves them brainless at the end of the day.
I mean, pairing IS tiring, and our job IS tiring, BUT let me tell you a thing, and I want to make it clear:
If you are pair programming for 8 hours straight, you are doing it wrong!
Pair programming is a powerful and brain energy-consuming activity; therefore, taking breaks IS KEY. But pairs are frequently so focused on the task that they forget to take them until they suddenly realize they’re exhausted.
Here’s one technique I use to remember to take breaks: Pomodoro.
Essentially it forces you to take a break every 25 minutes of straight pair programming. The break usually lasts 5 minutes, during which you shouldn’t do anything related to that specific task. Do something else, step away from screens, look out the window or refill your glass of water.
Of course, you can be flexible according to the needs of you and your pair, but you shouldn’t push too hard on that 25-minute limit. Taking breaks is part of this activity; you and your pair should do it frequently. It’s essential to have breaks and rest. Please do it.
Another false assertion is that pairing slows down the development process.
Why have two people on the same task when you can have two people on two different tasks and speed up the development?
It seems a fair statement on the surface, but let’s look deeper:
Doing code reviews is a common practice in many organizations. Once you are done with your story, another person (or sometimes more than one) reviews and ultimately accepts the code you wrote to be merged.
What are the pitfalls of this practice?
Working alone on a task is cool until someone else on the team needs to know what you have done and how, and you are not available to ask.
Handing over your knowledge of that task will require explaining the code you have written, the context and the choices you have made, and hoping they get it quickly.
These are all issues you can mitigate by working with a pair from the start.
This way, two people build the same knowledge base around a specific task. Frequent rotations between pairs help to spread this knowledge, increasing the collective ownership of the codebase.
Are you off tomorrow? No worries, the code will continue to be developed while you’re out because the team knows what you know about the problem you’re solving together.
Then don’t do it!
Pairing is a valuable activity on several levels but should be used when it makes sense. Otherwise, it’s just another practice you follow because someone said you should.
If you see a task (e.g., adjusting some documentation) that doesn’t require pairing, do it yourself. It is okay to have a solo moment, and it’s up to you to decide whether or not to pair with someone else.
True, remote pairing is more complicated than pairing physically. However, nowadays, we have some tools, practices and equipment to overcome these challenges as much as possible.
Pairing is not only about showing what you know but also showing what you don’t know.
It’s totally fine saying “I don’t know” during a pairing session. It’s fine to admit you have less experience or are just blocked.
Your pair has to support, unblock, and guide you.
Pairing has the colossal benefit that cultivating people is part of the activity itself. It helps juniors grow and gain confidence, and it allows seniors to learn from less experienced engineers (it happens frequently).
Pairing is about trust and being honest and open with your pair, so don’t be afraid. Learning is a beautiful journey, and pairing is the safest way to do it.
Having more experience than your pair shouldn’t be a problem; it’s an occasion for mentoring.
The goal is to develop a shared context of the problem and solution. As a more experienced person, adjust your speed so your pair can follow along and ask you questions as necessary.
The pairing session is also an excellent way to clarify whether something is understood; if it can be explained in simple words, then you know it. Think of pairing with a less experienced colleague as a perfect way to improve your speaking and teaching skills.
There are probably many other complaints we could consider, but the result will always be the same.
Pair programming brings tons of benefits to your team if it is done correctly.
My advice is to set expectations at the beginning of each session. Agree on what the pairing session will look like and clarify the style together to avoid misaligned assumptions resulting in behaviours that can degrade the experience for both of you.
Happy Pairing!
Here are some resources you may find useful to convince your boss that pair programming is the way:
I want to thank my colleague Judy for taking the time to read and review this article and for providing me with lots of interesting advice and corrections.
Disclaimer: I’m still learning the language, so any feedback is very welcome. I’m writing this article to describe my journey and what I’ve learned so far, so any suggestions and corrections will definitely be appreciated.
Working on legacy code often means using whatever is already there; in this case, the testing framework that was already installed, but that I had never used before, was Jest.
Luckily, most of these frameworks are quite similar, but having a look at the documentation definitely helps.
One of the problems we have when working on legacy code is that testing it is quite tricky, because the code wasn’t made to be tested in the first place.
So you find these gigantic classes full of large methods with lots of dependencies that are impossible to test: a big ball of mud.
A legacy codebase can be composed of thousands of classes, each made up of hundreds of methods and lines of code, so where should we start?
My first step into a legacy codebase is about identifying the core business logic, the critical one. I work with the client’s team to identify which part should never break and which part delivers value.
Once identified, I start retrofitting some unit tests, trying to cover the main use cases, ensuring that everything works as expected, and building my safety net.
Working without a safety net is risky: it will slow us down and increase the likelihood of breaking something.
Often you have lots of dependencies to take care of, and most of the time those dependencies are doing lots of weird stuff you don’t know anything about, so the safest way to test everything without worrying about how those dependencies behave is to mock them, brutally.
In the following sections you can find some snippets of code I found useful during my journey on that project; handling legacy code can be tricky, and having the right snippet at the right moment can be a lifesaver.
I found out that, compared with other languages, TypeScript allows you to mock an object easily thanks to its structural (duck) typing.
Let’s see an example:
const realDependency = {
  functionIneedToMock: () => { /* real implementation */ },
  anotherFunctionIwantToMock: () => { /* real implementation */ },
}
We can just create a new object that reflects the same property structure as the dependency we need to mock, and that’s it: the TypeScript compiler will identify that object as a compatible type automagically, allowing us to use it seamlessly. ✨
const mockedDependency = {
  functionIneedToMock: jest.fn(),
  anotherFunctionIwantToMock: jest.fn(),
}
As you can notice, I’ve replaced the implementation with jest.fn(), which allows us to mock that function using Jest’s functionality.
Let’s see a more concrete example:
it("Should test something", () => {
const mockedDependency = {
functionIneedToMock: jest.fn(),
anotherFunctionIwantToMock: jest.fn(),
};
const obj = new ClassUnderTest();
obj.methodUnderTest(mockedDependency);
expect(mockedDependency.functionIneedToMock)
.toHaveBeenCalledTimes(2);
})
In the previous example, we want to be sure our mocked function is called 2 times. Easy, right?
How about the implementation? Easy peasy!
const mockedDependency = {
  functionIneedToMock: jest.fn(() => "hello world"),
  anotherFunctionIwantToMock: jest.fn(),
};
jest.fn(implementation) is a shorthand for jest.fn().mockImplementation(implementation).
Everything looks nice when you can inject your dependencies, but in the TypeScript world you can also just import whatever you need and use it wherever you want, more or less, without passing parameters. How can we mock or spy on those dependencies? Let’s see an example of how to do it:
import { MessageService } from "../message-service";

const messageService = new MessageService();

export class Manager {
  methodWhichUseLotsOfDependencies() {
    // ...
    messageService.publishMessage("hello");
    // ...
  }
}
and then let’s try to spy on our dependency using Jest:
import { MessageService } from "../message-service";

const spiedPublishMessageService = jest.spyOn(
  MessageService.prototype,
  "publishMessage"
);

it("Should publish the message", () => {
  const manager = new Manager();
  manager.methodWhichUseLotsOfDependencies();

  expect(spiedPublishMessageService)
    .toHaveBeenCalledWith("hello");
});
Simple and clean: essentially, we are spying on our dependency, specifically on the publishMessage method, and then asserting that it receives hello as a parameter.
Of course, we can always program the mocked dependency’s behaviour, since the jest.spyOn method returns a Jest mock 🏋🏻♂️
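For instance, a minimal sketch building on the spy above (it assumes publishMessage doesn’t return anything we care about): we can replace the real behaviour and restore it afterwards.

// The spy is a Jest mock, so its behaviour can be programmed like any other mock.
spiedPublishMessageService.mockImplementation(() => {
  // no-op: skip the real publish during the test
});

// ...and the original method can be restored once we're done with it.
spiedPublishMessageService.mockRestore();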
Often in legacy codebases we may find multiple static methods, used improperly.
How can we handle them with Jest?
We have a Mapper
class with a static mapSomethingToSomethingElse
method that we want to mock, for instance:
export class Mapper {
static mapSomethingToSomethingElse() { ... }
...
}
our mock:
import { Mapper } from "../mappers/mapper";

jest.mock("../mappers/mapper", () => ({
  Mapper: {
    mapSomethingToSomethingElse: jest.fn()
      .mockImplementation(() => {
        return "dummyMock";
      }),
  },
}));
And then we can perform some assertions like:
expect(Mapper.mapSomethingToSomethingElse)
.toHaveBeenCalledTimes(1);
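As an alternative I sometimes reach for (a rough sketch, not what that codebase used), jest.spyOn also works on static methods, since they live directly on the class object:

import { Mapper } from "../mappers/mapper";

// Spy on the static method directly on the class and stub out its result.
const mapSpy = jest.spyOn(Mapper, "mapSomethingToSomethingElse")
  .mockReturnValue("dummyMock");

// later, in a test, after exercising the code under test:
// expect(mapSpy).toHaveBeenCalledTimes(1);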
Testing an AWS Lambda handler is very similar to what we have already done, for instance:
import { APIGatewayProxyEvent, APIGatewayProxyResult, Context } from "aws-lambda";

export async function ApiGatewayDoSomethingWithLambdaHandler(
  event: APIGatewayProxyEvent,
  context: Context
): Promise<APIGatewayProxyResult> { /* ... */ }
And our test will look something like this. Let’s say we want to spy on our AuthService, providing dummy dependencies:
const mockedAuthService = jest.spyOn(AuthService.prototype, "checkPermissions");

it("Should test the handler", async () => {
  const dummyProxyEvent: Partial<APIGatewayProxyEvent> = {
    headers: { Authorization: "dummyToken" },
    body: "dummyBody",
  };
  const dummyContext: Partial<Context> = {
    awsRequestId: "dummyAwsRequestId"
  };

  const response = await ApiGatewayDoSomethingWithLambdaHandler(
    dummyProxyEvent as APIGatewayProxyEvent,
    dummyContext as Context
  );

  expect(mockedAuthService)
    .toHaveBeenCalledTimes(1);
})
As you can see, we used the Partial type to avoid mocking every property of those objects and then passed them to our handler, asserting that one of our mocks has been called correctly.
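One small convenience I sometimes add on top of this (a hypothetical helper of mine, not something from Jest or the AWS typings) is a tiny function that hides the cast in a single place:

import { APIGatewayProxyEvent } from "aws-lambda";

// Hypothetical helper: build a fixture from a Partial and do the cast in one place.
function fixtureOf<T>(overrides: Partial<T>): T {
  return overrides as T;
}

const dummyProxyEvent = fixtureOf<APIGatewayProxyEvent>({
  headers: { Authorization: "dummyToken" },
  body: "dummyBody",
});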
I still need to learn more than a bunch of things about this very powerful language; I’ve just scratched the tip of the iceberg, and I’m looking forward to continuing to explore this world called TypeScript.
As an Extreme Programmer, I like to explore new things, try new stuff, solve problems, and overcome challenges. Jumping from one project to another keeps pushing me outside my comfort zone, trying to learn at least something new every day.