Let’s start by describing what an Application Layer Load Balancer is, which is important to understand what it does and where the complexities of implementing one lie.
Usually a Load Balancer sits in front of a group of servers and routes client requests across all of the servers that are capable of fulfilling those requests.
Load Balancers ensure that the traffic is equally distributed between our healthy servers, minimising the response time.
There are different types of load balancers, and they can work at different levels of the OSI model. In this case, I’m going to focus on layer seven of the stack, routing HTTP requests from clients to a pool of HTTP servers.
flowchart TD
subgraph front
Client <--> LoadBalancer
end
subgraph back
LoadBalancer <-.-> Server1 & Server2 & Server3
end
Just a friendly reminder that the process I took could have been avoided, improved, and, of course, may be plain wrong.
I’m just telling my story and my Golang-improving journey through this post.
Feel free to give me feedback about it and, if it makes you learn something new or reflect on a topic you never thought about, just let me know ☀️
First of all, I needed to create a simple server. Golang is very powerful and allows you to do so in a few steps.
In the main, for example, we can have:
http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
fmt.Printf("Received request from %s\n", request.RemoteAddr)
fmt.Printf("%s / %s\n", request.Method, request.Proto)
fmt.Printf("%s / %s\n", request.Method, request.Proto)
fmt.Printf("Host: %s\n", request.Host)
fmt.Printf("User-Agent: %s\n", request.Header.Get("User-Agent"))
fmt.Printf("Accept: %+v\n\n", request.Header.Get("Accept"))
fmt.Printf("Replied with a hello message\n")
fmt.Fprintf(writer, "Hello From Backend Server")
})
err := http.ListenAndServe(":80", nil)
if err != nil {
log.Fatal("Error listening and serve")
}
And once run, we will have our small server listening on port 80, logging each request received at the / endpoint.
To verify it you can just call curl http://localhost/ --output - and get as a result: Hello From Backend Server.
Of course, this is not yet a service that forwards our requests to the specified servers, but it’s a start to understand how Golang works.
So to get back to our problem, I started with a unit test:
t.Run("should call the client to forward the request", func(t *testing.T) {
req, _ := http.NewRequest(http.MethodGet, "/", nil)
resp := httptest.NewRecorder()
mockClient := newMockClient()
spartimillu := NewSpartimilluServer(mockClient)
mockClient.On("ForwardRequest", mock.Anything).Return(&http.Response{
Status: "200 OK",
StatusCode: 200,
Proto: "HTTP/1.0",
Body: io.NopCloser(bytes.NewBufferString("dummy body")),
Request: req,
})
spartimillu.ServeHTTP(resp, req)
mockClient.AssertExpectations(t)
assert.Equal(t, "dummy body", resp.Body.String(), "got %q, want %q", resp.Body.String(), "dummy body")
})
With this test:
- I created a request and a response recorder using the httptest package.
- I created a MockClient to mock the client that will forward the requests:
type MockClient struct {
mock.Mock
}
func newMockClient() *MockClient { return &MockClient{} }
func (m *MockClient) ForwardRequest(req http.Request) *http.Response {
args := m.Called(req)
return args.Get(0).(*http.Response)
}
func (m *MockClient) HealthCheck() {
m.Called()
}
- I created a new SpartimilluServer:
type SpartimilluServer struct {
client client.Client
}
func NewSpartimilluServer(client client.Client) *SpartimilluServer {
return &SpartimilluServer{client: client}
}
- I mocked the ForwardRequest function to return a custom response with dummy body as a body.
- I asserted the expectations calling ServeHTTP.
The implementation was quite straightforward:
func (s *SpartimilluServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
fmt.Printf("Received request from %s\n", r.RemoteAddr)
fmt.Printf("%s / %s\n", r.Method, r.Proto)
fmt.Printf("%s / %s\n", r.Method, r.Proto)
fmt.Printf("Host: %s\n", r.Host)
fmt.Printf("User-Agent: %s\n", r.Header.Get("User-Agent"))
fmt.Printf("Accept: %+v\n", r.Header.Get("Accept"))
resp := s.client.ForwardRequest(*r)
fmt.Printf("Response from server: %s %s\n\n", resp.Proto, resp.Status)
body, err := io.ReadAll(resp.Body)
if err != nil {
http.Error(w, "Error reading the response body", http.StatusInternalServerError)
return
}
stringBody := string(body)
fmt.Fprint(w, stringBody)
fmt.Println(stringBody)
}
Now we need to implement our client and its functionalities.
Let’s start with an integration test!
t.Run("should forward a GET request to a specific server", func(t *testing.T) {
server, address := startTestServer(t, "ok")
defer server.Close()
conf := NewSpartimilluClientConf([]string{address})
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
resp := client.ForwardRequest(*req)
body := getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "ok", body, "got %v, wanted %v", body, "ok")
})
So let’s see what is happening here:
- I spawn a test server using the httptest package and defer its shutdown with the Close method:
func startTestServer(t *testing.T, bodyResponse string) (*httptest.Server, string) {
t.Helper()
server := httptest.NewServer(http.HandlerFunc(func(writer http.ResponseWriter, request *http.Request) {
fmt.Printf("%s has been called\n", bodyResponse)
fmt.Fprint(writer, bodyResponse)
}))
return server, server.URL
}
As you can see we are using the NewServer method to spawn up a new server and set a handler function that returns some info.
- I create a SpartimilluClientConf struct:
type SpartimilluClientConf struct {
addresses []string
}
func NewSpartimilluClientConf(addresses []string) SpartimilluClientConf {
return SpartimilluClientConf{addresses: addresses}
}
As you can see it just contains the server addresses (we will add the health-check endpoint later).
- I call the ForwardRequest method passing the request I created, reading the body with a small helper:
func getBodyFromResp(t *testing.T, resp *http.Response) string {
t.Helper()
bodyBytes, err := io.ReadAll(resp.Body)
assert.Nil(t, err)
return string(bodyBytes)
}
To then jump into the implementation:
type Client interface {
ForwardRequest(req http.Request) *http.Response
}
type SpartimilluClient struct {
conf SpartimilluClientConf
}
func NewSpartimilluClient(conf SpartimilluClientConf) *SpartimilluClient {
return &SpartimilluClient{conf: conf}
}
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(s.conf.addresses[0] + req.RequestURI)
}
return nil
}
func sendGetRequestToAnotherServer(url string) *http.Response {
resp, err := http.Get(url)
if err != nil {
log.Fatal("Can't perform the GET request towards the server")
}
return resp
}
As you can see I just called the http.Get(url) method, emulating the GET request we received.
To see everything in action we can just call our main:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{"http://localhost:8080"}))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
In this main:
- I set the address of the backend server listening on port 8080 and provided it into the configuration.
- I created the SpartimilluServer function handler and used it for my ListenAndServe function.
To spawn up my little server you can use a different main with the code you saw before, or just create a directory called, for example, server8080 containing an index.html file with this content:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
and run in your terminal: python -m http.server 8080 --directory server8080. It will spawn up a Python server serving the content of the server8080 directory.
Then you can just call your young load balancer: curl http://localhost/ --output -
The result will be:
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4. index.html"--> CURL
end
subgraph back
SpartimilluServer-. "2. GET index.html" .-> PythonServer8080
PythonServer8080-. "3. index.html" .-> SpartimilluServer
end
Now that we have our “forwarder” in place, we have to distribute the incoming requests using a scheduling algorithm called “Round Robin”.
It’s quite simple: we just need to distribute the traffic to each server in the list, one after the other, and once we have forwarded to all of them we start back at the beginning of the list.
For example:
Server | Request |
---|---|
A | 1, 4 |
B | 2, 5 |
C | 3, 6 |
Let’s start with another integration test:
t.Run("should forward a GET request to any server using a round robin algorithm", func(t *testing.T) {
server1, address1 := startTestServer(t, "server1")
defer server1.Close()
server2, address2 := startTestServer(t, "server2")
defer server2.Close()
server3, address3 := startTestServer(t, "server3")
defer server3.Close()
conf := NewSpartimilluClientConf([]string{address1, address2, address3})
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
resp := client.ForwardRequest(*req)
body := getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server1", body, "got %v, wanted %v", body, "server1")
resp = client.ForwardRequest(*req)
body = getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server2", body, "got %v, wanted %v", body, "server2")
resp = client.ForwardRequest(*req)
body = getBodyFromResp(t, resp)
assert.Equal(t, http.MethodGet, resp.Request.Method, "got %v, wanted %v", resp.Request.Method, http.MethodGet)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, "server3", body, "got %v, wanted %v", body, "server3")
})
We spawn up 3 servers and set their addresses in our configuration. Calling ForwardRequest three times, we expect the first request to reach server1, the second server2, and the third server3.
Let’s jump into the implementation:
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
}
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
var resp *http.Response
serverIndex := s.counter % len(s.conf.addresses)
switch req.Method {
case http.MethodGet:
resp = sendGetRequestToAnotherServer(s.conf.addresses[serverIndex] + req.RequestURI)
}
s.counter++
return resp
}
Using a counter and the modulo operator I implemented a simple round-robin algorithm.
Let’s see how it works:
counter | operation | serverIndex |
---|---|---|
0 | 0 % 3 | 0 |
1 | 1 % 3 | 1 |
2 | 2 % 3 | 2 |
3 | 3 % 3 | 0 |
4 | 4 % 3 | 1 |
5 | 5 % 3 | 2 |
6 | 6 % 3 | 0 |
… | … | … |
And so on. 🤯
Our main should look like this:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
As you can see we specified 2 addresses in our configuration, a server listening at 8080
and one listening at 8081
.
Of course, before starting our load balancer we should spawn up our servers.
Let’s create a directory (as we did before with server8080) but this time called server8081, with an index.html inside containing something similar:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8081.
</body>
</html>
Then we can just run these commands in 2 different shells:
python -m http.server 8080 --directory server8080
python -m http.server 8081 --directory server8081
Once both servers are up we can test it out just executing our main and calling our load balancer three times to see how it works:
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
and
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8081.
</body>
</html>
and
❯ curl http://localhost/ --output -
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Index Page</title>
</head>
<body>
Hello from the web server running on port 8080.
</body>
</html>
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4/6. index.html"--> CURL
end
subgraph back
SpartimilluServer-. "2. GET index.html" .-> PythonServer8080
PythonServer8080-. "3. index.html" .-> SpartimilluServer
SpartimilluServer-. "4. GET index.html" .-> PythonServer8081
PythonServer8081-. "5. index.html" .-> SpartimilluServer
end
Now that we implemented the main functionality we have to implement the health check that helps us to always forward the request to a live server.
Let’s start with a unit test:
t.Run("should call the client to do an health check", func(t *testing.T) {
req, _ := http.NewRequest(http.MethodGet, "/healthcheck", nil)
resp := httptest.NewRecorder()
mockClient := newMockClient()
spartimillu := NewSpartimilluServer(mockClient)
mockClient.On("HealthCheck").Return(&http.Response{
Status: "200 OK",
StatusCode: 200,
Proto: "HTTP/1.0",
Request: req,
})
spartimillu.HealthCheck()
mockClient.AssertExpectations(t)
assert.Equal(t, http.StatusOK, resp.Code, "got %q, want %q", resp.Code, http.StatusOK)
})
type MockClient struct {
mock.Mock
}
func newMockClient() *MockClient { return &MockClient{} }
func (m *MockClient) ForwardRequest(req http.Request) *http.Response {
args := m.Called(req)
return args.Get(0).(*http.Response)
}
func (m *MockClient) HealthCheck() {
m.Called()
}
As we have done before we are checking that the method HealthCheck
has been implemented correctly in our SpartimilluServer
.
With a very simple implementation:
func (s *SpartimilluServer) HealthCheck() {
fmt.Printf("Performing Health Check\n")
s.client.HealthCheck()
}
This method would be called every N seconds to check if our servers are still alive.
As usual, let’s continue with an integration test for our client:
t.Run("should perform a health check towards a server", func(t *testing.T) {
server1, address1 := startTestServer(t, "server1")
defer server1.Close()
server2, address2 := startTestServer(t, "server2")
defer server2.Close()
server3, address3 := startTestServer(t, "server3")
defer server3.Close()
conf := NewSpartimilluClientConf([]string{address1, address2, address3}, "/healthcheck")
client := NewSpartimilluClient(conf)
req := httptest.NewRequest(http.MethodGet, "/", nil)
client.HealthCheck()
resp := client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server1.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server1.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server2.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server2.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
server1.Close()
client.HealthCheck()
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server2.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server2.URL)
server2.Close()
client.HealthCheck()
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
resp = client.ForwardRequest(*req)
assert.Equal(t, http.StatusOK, resp.StatusCode, "got %v, wanted %v", resp.StatusCode, http.StatusOK)
assert.Equal(t, server3.URL, resp.Request.URL.Scheme+"://"+resp.Request.Host, "got %v, wanted %v", resp.Request.URL.Scheme+"://"+resp.Request.Host, server3.URL)
})
Let’s see what we did:
- We call the HealthCheck method, which is supposed to update our list of available servers.
- We call the ForwardRequest method three times, checking that we contact the right server each time.
- We shut down server1 and call the ForwardRequest method again; this time it should call server2, since server1 has been shut down.
- We shut down server2 as well, expecting the client to call the only one left, server3.
Here is the implementation:
type Client interface {
ForwardRequest(req http.Request) *http.Response
HealthCheck()
}
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
healthyServers map[string]bool
}
We added to our struct a map of healthyServers that will be updated by our HealthCheck
method.
func NewSpartimilluClient(conf SpartimilluClientConf) *SpartimilluClient {
return &SpartimilluClient{conf: conf, healthyServers: make(map[string]bool)}
}
func (s *SpartimilluClient) HealthCheck() {
for _, address := range s.conf.addresses {
resp, err := http.Get(address)
if err == nil && resp.StatusCode == http.StatusOK {
s.healthyServers[address] = true
} else {
s.healthyServers[address] = false
}
}
}
We iterate over the list of addresses and for each server we do a GET request to check if we get an OK; if so we update our healthyServers map, using the address as the key and a boolean as the value.
And this is the implementation of our ForwardRequest
method:
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
if len(s.healthyServers) == 0 {
s.HealthCheck()
}
index := s.counter % len(s.conf.addresses)
address := s.conf.addresses[index]
s.counter++
if s.healthyServers[address] {
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(address + req.RequestURI)
}
}
return s.ForwardRequest(req)
}
Here we make sure that we perform a health check at least once before we start forwarding requests around.
Then, if the server we want to contact is healthy, we contact it; otherwise we make a recursive call to try the next one in the list.
The recursion is not very efficient but it works nicely for now, we will improve it in the next step.
Our main now looks like this:
func main() {
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}, "/healthcheck"))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
ticker := time.NewTicker(5 * time.Second)
go func() {
for {
select {
case <-ticker.C:
spartimilluServer.HealthCheck()
}
}
}()
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
This is the first raw version of our async HealthCheck.
We call asynchronously our HealthCheck
every 5 seconds using a ticker
.
To try it out we can reproduce the steps above: spawn our 2 stubbed servers, run our load balancer, then kill one of the two and check that, after at most 5 seconds, the load balancer contacts only the server that is still alive.
flowchart TD
subgraph front
CURL --"1. GET index.html"--> SpartimilluServer
SpartimilluServer --"4. index.html"--> CURL
end
subgraph back
PythonServer8080
SpartimilluServer-. "2. GET index.html" .-> PythonServer8081
PythonServer8081-. "3. index.html" .-> SpartimilluServer
end
Another thing we have to make sure to handle is concurrency.
As you may have noticed, our HealthCheck function modifies a map that is shared with ForwardRequest, and this can cause concurrency issues since both functions can access it at the same time. We can protect it using a mutex.
type SpartimilluClient struct {
conf SpartimilluClientConf
counter int
healthyServers map[string]bool
mu sync.Mutex
}
Here is our HealthCheck
implementation:
func (s *SpartimilluClient) HealthCheck() {
for _, address := range s.conf.addresses {
resp, err := http.Get(address)
s.mu.Lock()
if err == nil && resp.StatusCode == http.StatusOK {
s.healthyServers[address] = true
} else {
s.healthyServers[address] = false
}
s.mu.Unlock()
}
}
Every time we want to access our healthyServers map we take the lock, to be sure that nobody else can touch it until we release it.
And the ForwardRequest
one:
func (s *SpartimilluClient) ForwardRequest(req http.Request) *http.Response {
for {
if len(s.healthyServers) == 0 {
s.HealthCheck()
}
s.mu.Lock()
index := s.counter % len(s.conf.addresses)
address := s.conf.addresses[index]
s.counter++
if s.healthyServers[address] {
s.mu.Unlock()
switch req.Method {
case http.MethodGet:
return sendGetRequestToAnotherServer(address + req.RequestURI)
}
}
s.mu.Unlock()
time.Sleep(100 * time.Millisecond)
}
}
Here I did a bit of refactoring and, of course, used the Lock to handle concurrency issues. I also replaced the recursion with an infinite for loop and a short sleep before retrying to contact our servers.
After a bit of refactoring our main looks like this:
func main() {
const seconds = 1 * time.Second
spartimilluClient := client.NewSpartimilluClient(client.NewSpartimilluClientConf([]string{
"http://localhost:8080",
"http://localhost:8081",
}, "/healthcheck"))
spartimilluServer := server.NewSpartimilluServer(spartimilluClient)
go doEvery(seconds, spartimilluServer.HealthCheck)
log.Fatal(http.ListenAndServe(":80", spartimilluServer))
}
func doEvery(d time.Duration, f func()) {
ticker := time.Tick(d)
for range ticker {
go f()
}
}
At this point you should have your load balancer switching from one server to another and performing health checks correctly.
It has been quite a fun challenge: I iteratively built up my application load balancer, starting from a small forwarder and then adding more complex logic, facing some nice challenges along the way, like evolving my code to embrace change and doing integration tests by spawning up stub servers.
Of course, this is a very basic load balancer; it can be improved and extended, but I’m satisfied with it for now.
You can find the repository with the code in my Github profile, https://github.com/dlion/spartimillu.
What do you think about my solution? Any feedback would be appreciated and of course, if you make your solution don’t be shy and share it with me too!
Happy Coding!
Golang has always been one of my favorite languages and I’ve been using it for a few months so far; after my Take 3 experience I decided to keep studying it during my free time, and since I like hands-on projects, I used it to create a side-pet-challenge/project.
I recently discovered Coding Challenges, a website full of hands-on coding challenges that you can take in different languages; I chose mine: Go.
Sometimes I struggle to find new ideas and this website helped me a lot with that.
To start with something simple I decided to implement WC, the famous Word Count unix tool.
To learn more about it, run man wc in your terminal; essentially, it counts words, lines, characters, and bytes of a specific file or piped stream.
From this very high-level point of view, it seems quite simple but digging deeper you will see that it’s not as simple as you could have thought at the beginning.
Just a friendly reminder that the process I took could have been avoided, improved, and, of course, may be plain wrong.
I’m just telling my story through this project and my Golang improving journey.
Feel free to give me feedback about it and, if it makes you learn something new or reflect on a topic you never thought about, just let me know ☀️
Starting from scratch in Go is quite simple, so I just created my repo, opened my IntelliJ GoLand IDE, and created a simple hello world, ready to jump into my first requirement implementation.
The first requirement is to have a small functionality, just counting the number of bytes from a specific file.
Using the file that has been provided the result of this command should be:
>./gowc -c test.txt
342190 test.txt
Let’s see what we got from that:
Input:
- the -c parameter, which is the way we activate the count-bytes functionality
- test.txt, which is the file we want to count from
Output:
- the number of bytes followed by the filename
Through my repo’s commits you can see the history of my changes; I started with something completely different (like mocking a filesystem using testify/mock) and ended up with a bunch of simple unit tests:
func TestWcBytesReader(t *testing.T) {
t.Run("Count reads 0 bytes", func(t *testing.T) {
dummyContent := make([]byte, 0)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(0)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
t.Run("Count reads 1 byte", func(t *testing.T) {
dummyContent := make([]byte, 1)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(1)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
t.Run("Count reads multiple bytes", func(t *testing.T) {
dummyContent := make([]byte, 100)
r := NewWcBytesReader()
currentBytes := r.Count(dummyContent)
expected := int64(100)
assert.Equal(t, expected, currentBytes, "Got %d, wanted %d", currentBytes, expected)
})
}
The implementation as you can imagine wasn’t a big deal for this feature:
func (w WcBytesReader) Count(content []byte) int64 {
return int64(len(content))
}
We will skip for now how I used it in the main; if you want to try out the parameter part you can just use the flag package, parse the arguments, and call the Count function from there once NewWcBytesReader has been called.
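For reference, here is a rough sketch of what that wiring could look like; the flag handling and file reading are my own assumptions, only NewWcBytesReader and Count come from the post (and they are assumed to be importable or in the same package):
package main

import (
	"flag"
	"fmt"
	"log"
	"os"
)

func main() {
	countBytes := flag.Bool("c", false, "Count bytes of the file")
	flag.Parse()

	// The remaining argument is assumed to be the file to read.
	filename := flag.Arg(0)
	content, err := os.ReadFile(filename)
	if err != nil {
		log.Fatalf("Error reading the file: %v", err)
	}

	if *countBytes {
		// NewWcBytesReader is the reader shown above.
		fmt.Printf("%d %s\n", NewWcBytesReader().Count(content), filename)
	}
}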
The second requirement was to support the command line option -l
that outputs the number of lines in a file.
The CLI input/output should be:
>./gowc -l test.txt
7145 test.txt
Let’s see what we got from that:
Input:
- the -l parameter, which is the way we activate the count-lines functionality
- test.txt, which is the file we want to count from
Output:
- the number of lines followed by the filename
Here are some unit tests I wrote:
t.Run("Count returns 0 lines with an empty file", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 1 lines with just one line", func(t *testing.T) {
dummyFile := []byte("Dummy String")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 3 lines with a multi lines file content", func(t *testing.T) {
dummyFile := []byte("Line 1\nLine 2\nLine 3")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
t.Run("Count returns 3 lines with a multi lines file content with a trailing empty line", func(t *testing.T) {
dummyFile := []byte("Line 1\nLine 2\nLine 3\n")
r := NewWcLinesReader()
currentLines := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, currentLines, "Got %d, wanted %d", currentLines, expected)
})
And here is the implementation:
func (w WcLinesReader) Count(content []byte) int64 {
if len(content) == 0 {
return int64(0)
}
lines := strings.Split(string(content), "\n")
if lines[len(lines)-1] == "" {
return int64(len(lines) - 1)
}
return int64(len(lines))
}
The third requirement was to support the command line option -w
that outputs the number of words in a file.
The CLI input/output should be:
>./gowc -w test.txt
58164 test.txt
Let’s see what we got from that:
Input:
- the -w parameter, which is the way we activate the count-words functionality
- test.txt, which is the file we want to count from
Output:
- the number of words followed by the filename
The unit tests I wrote:
t.Run("Count returns 0 if the file doesn't have words", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
t.Run("Count returns 1 if the file have just 1 word", func(t *testing.T) {
dummyFile := []byte("Dummy")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
t.Run("Count returns 3 if the file have 3 words", func(t *testing.T) {
dummyFile := []byte("Dummy Word Here")
r := NewWcWordsReader()
nWords := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, nWords, "Got %d, wanted %d", nWords, expected)
})
As you can see I always maintained the same style, starting with a simpler scenario and moving up to more complex ones.
The implementation:
func (w WcWordsReader) Count(content []byte) int64 {
words := strings.Fields(string(content))
return int64(len(words))
}
Here, instead of spending lots of time understanding what types of words I want to support, how Golang interprets and counts them, corner cases, Unicode characters, etc., I decided to use the strings.Fields method. According to the doc:
Fields splits the string s around each instance of one or more consecutive white space characters, as defined by unicode.IsSpace, returning a slice of substrings of s or an empty slice if s contains only white space. – Golang Doc
My goal wasn’t to reinvent the wheel, and in some contexts/domains (e.g. security) you shouldn’t either.
The fourth requirement was to support the command line option -m
that outputs the number of characters in a file.
The CLI input/output should be:
>./gowc -m test.txt
339292 test.txt
Let’s see what we got from that:
Input:
- the -m parameter, which is the way we activate the count-chars functionality
- test.txt, which is the file we want to count from
Output:
- the number of characters followed by the filename
The unit tests I wrote:
func TestWcCharsReader(t *testing.T) {
t.Run("Count reads 0 chars", func(t *testing.T) {
dummyFile := []byte("")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(0)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads 1 char", func(t *testing.T) {
dummyFile := []byte("a")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads multiple chars", func(t *testing.T) {
dummyFile := []byte("abc")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(3)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
t.Run("Count reads multiple chars included unicode ones", func(t *testing.T) {
dummyFile := []byte("🚀")
r := NewWcCharsReader()
nChars := r.Count(dummyFile)
expected := int64(1)
assert.Equal(t, expected, nChars, "Got %d, wanted %d", nChars, expected)
})
}
I still used the same format as before; the last test is different since it allows us to test Unicode characters (in this case an emoji). As you might know, Unicode characters are counted differently; if you want to know more about it, read this article: https://tonsky.me/blog/unicode/
And the following implementation:
func (w WcCharsReader) Count(content []byte) int64 {
return int64(utf8.RuneCount(content))
}
The utf8.RuneCount
method allows me to count the number of runes in a string considering utf-8s also.
RuneCount returns the number of runes in p. Erroneous and short encodings are treated as single runes of width 1 byte. – Golang Doc
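To see why this matters, here is a quick example of my own (not from the original post): the rocket emoji is a single character but takes 4 bytes in UTF-8, so counting bytes and counting runes give different results.
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	content := []byte("🚀")
	fmt.Println(len(content))            // 4: number of bytes
	fmt.Println(utf8.RuneCount(content)) // 1: number of runes (characters)
}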
In this step, we should support the default option: when no options have been provided, we activate the -c, -l, and -w options.
The CLI input/output should be then:
>./gowc test.txt
7145 58164 342190 test.txt
Since we have already implemented all functionalities we just need to rearrange the way we activate them.
At first look, it seems that having something like:
if *flagBytes == true { ... }
else if *flagLines == true { ... }
else if *flagWords == true { ... }
else { activeDefaultOptions() }
Might work fine, but I wanted to improve it a bit; I didn’t like the idea that adding new functionality would require touching/duplicating lots of code.
In the beginning, I got confused and thought that each option needed to have a filename attached, something like -c filename.txt -w filename.txt, but then I realized this would lead to a very complex solution, since flag doesn’t support “empty” flags without passing some default values, which in the case of strings would have been awkward.
So I reverted my design choice to a simpler one, using boolean flags instead.
I don’t like having everything in the main, so I created a parameters
dir and wrote some unit tests:
func TestParameters(t *testing.T) {
t.Run("Parameters have been provided", func(t *testing.T) {
os.Args = []string{"wc", "-l", "text.txt"}
actual := HasProvided()
assert.Truef(t, actual, "expected %t, got %t", true, actual)
})
t.Run("Parameters haven't been provided", func(t *testing.T) {
os.Args = []string{"wc"}
actual := HasProvided()
assert.Falsef(t, actual, "expected %t, got %t", false, actual)
})
t.Run("Get filename from parameter provided", func(t *testing.T) {
os.Args = []string{"wc", "-l", "text.txt"}
actual := GetFilename()
expected := "text.txt"
assert.Equal(t, expected, actual, "expected %t, got %t", expected, actual)
})
t.Run("Get true if at least one flag has been passed", func(t *testing.T) {
getBooleanPointer := func(b bool) *bool { return &b }
flags := map[string]*bool{
"c": getBooleanPointer(false),
"d": getBooleanPointer(true),
"e": getBooleanPointer(false),
}
actualName, actualBool := HaveBeenPassed(flags)
expectedName := "d"
assert.Equal(t, expectedName, actualName, "expected %t, got %t", expectedName, actualName)
assert.Truef(t, actualBool, "expected %t, got %t", true, actualBool)
})
t.Run("Get false if no flags have been passed", func(t *testing.T) {
getBooleanPointer := func(b bool) *bool { return &b }
flags := map[string]*bool{
"c": getBooleanPointer(false),
"d": getBooleanPointer(false),
"e": getBooleanPointer(false),
}
actualName, actualBool := HaveBeenPassed(flags)
expectedName := ""
assert.Equal(t, expectedName, actualName, "expected %t, got %t", expectedName, actualName)
assert.Falsef(t, actualBool, "expected %t, got %t", true, actualBool)
})
}
The functions I implemented are:
func HasProvided() bool {
return len(os.Args) > 1
}
It tells me whether any parameters have been provided.
func GetFilename() string {
return os.Args[len(os.Args)-1]
}
It gives me the last parameter entry which should be the filename.
func HaveBeenPassed(flags map[string]*bool) (string, bool) {
for flagName, flagValue := range flags {
if *flagValue == true {
return flagName, true
}
}
return "", false
}
It gives me the first flag that has been activated.
To initialize and get the flags I also wrote a function:
func GetFlags() map[string]*bool {
flags := map[string]*bool{
BytesFlag: flag.Bool(BytesFlag, false, "Count bytes of the file"),
LinesFlag: flag.Bool(LinesFlag, false, "Count lines of the file"),
WordsFlag: flag.Bool(WordsFlag, false, "Count words of the file"),
CharsFlag: flag.Bool(CharsFlag, false, "Count chars of the file"),
}
flag.Parse()
return flags
}
It creates a map with strings as keys (specific const values) and boolean pointers as values, which flag sets according to the CLI parameters.
It refers to the const variables, which are the keys of the GetFlags
map and our parameters:
const (
BytesFlag = "c"
LinesFlag = "l"
WordsFlag = "w"
CharsFlag = "m"
)
So if tomorrow we need to add a new parameter we can just add a new Constant, add it to our map with the respective flag.Bool
call, and everything is encapsulated inside the parameters.go
file.
So going back to our main we have to get the input combining the functions above:
func getInput() ([]byte, string) {
if parameters.HasProvided() {
filename := parameters.GetFilename()
return readFile(filename), filename
}
return make([]byte, 0), EmptyString
}
I created a private function to verify if some parameters have been passed, and then I got the file name and read it, returning the content.
Then, after setting and getting the flags using the function parameters.GetFlags(), I initialized my readers using the function reader.InitializeReaders(), which instantiates all readers, storing them into a map of string to WcReaderManager:
func InitializeReaders() map[string]WcReaderManager {
return map[string]WcReaderManager{
parameters.BytesFlag: bytesReader.NewWcBytesReader(),
parameters.LinesFlag: linesReader.NewWcLinesReader(),
parameters.WordsFlag: wordsReader.NewWcWordsReader(),
parameters.CharsFlag: charsReader.NewWcCharsReader(),
}
}
Once initialized, I verify, calling the function parameters.HaveBeenPassed(flags), whether any parameter has been passed.
If a flag has been passed, I call reader.CountWithSpecificReader(initializedReaders[flagNamePassed], input), whose implementation is:
func CountWithSpecificReader(specificReader WcReaderManager, input []byte) int64 {
return specificReader.Count(input)
}
It gets a specific reader based on the flag that has been passed and the input, which is the content of the file, and calls its Count function, returning the output.
Otherwise, I call reader.CountBytesWordsAndLines(initializedReaders, input), which uses the initialized readers saved into the map to count the input for the 3 default options: bytes, words, and lines. The implementation is:
func CountBytesWordsAndLines(readers map[string]WcReaderManager, input []byte) (int64, int64, int64) {
return readers[parameters.BytesFlag].Count(input),
readers[parameters.WordsFlag].Count(input),
readers[parameters.LinesFlag].Count(input)
}
I was able to accomplish that thanks to the WcReaderManager
interface:
type WcReaderManager interface {
Count(content []byte) int64
}
If you are curious about the approach I used, have a look into the Strategy Pattern.
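The post doesn’t show the final main, but to make the flow concrete here is a rough sketch of how these pieces could be wired together; the function names come from the snippets above, while the output formatting is my assumption based on the expected CLI output:
func main() {
	flags := parameters.GetFlags()
	initializedReaders := reader.InitializeReaders()
	input, filename := getInput()

	// If a single flag has been passed, use the matching reader.
	if flagNamePassed, passed := parameters.HaveBeenPassed(flags); passed {
		fmt.Printf("%d %s\n", reader.CountWithSpecificReader(initializedReaders[flagNamePassed], input), filename)
		return
	}

	// Default behaviour: lines, words, and bytes, like wc with no options.
	bytesCount, wordsCount, linesCount := reader.CountBytesWordsAndLines(initializedReaders, input)
	fmt.Printf("%d %d %d %s\n", linesCount, wordsCount, bytesCount, filename)
}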
The final step is about supporting reading from standard input if no filename is specified.
The CLI input/output should be:
>cat test.txt | ./gowc -l
7145
To do that I created a directory called pipeline, containing useful functions to solve the problem. Here are the unit tests I wrote:
func TestPipeline(t *testing.T) {
t.Run("HasInput should return false if an input hasn't come from pipeline", func(t *testing.T) {
actual := HasInput()
assert.Falsef(t, actual, "expected %t, got %t", true, actual)
})
t.Run("HasInput should truw if an input comes from pipeline", func(t *testing.T) {
r, w, _ := os.Pipe()
_, _ = w.Write([]byte("Hello"))
_ = w.Close()
defer func(v *os.File) { os.Stdin = v }(os.Stdin) // capture the original stdin before replacing it
os.Stdin = r
actual := HasInput()
assert.Truef(t, actual, "expected %t, got %t", true, actual)
})
}
And here is the implementation:
func HasInput() bool {
f, _ := os.Stdin.Stat()
return (f.Mode() & os.ModeCharDevice) == 0
}
func ReadInput() []byte {
input, err := io.ReadAll(os.Stdin)
if err != nil {
log.Fatalf("Error reading the pipeline: %v", err)
}
return input
}
I check if the information that we get from the standard input is coming from a pipe operator, and then I just updated the getInput
function used before:
func getInput() ([]byte, string) {
const EmptyString = ""
if pipeline.HasInput() {
return pipeline.ReadInput(), EmptyString
}
if parameters.HasProvided() {
filename := parameters.GetFilename()
return readFile(filename), filename
}
return make([]byte, 0), EmptyString
}
func readFile(filename string) []byte {
input, err := os.ReadFile(filename)
if err != nil {
log.Fatalf("Error reading the file: %v", err)
}
return input
}
- gowc
  - parameters
  - pipeline
  - reader
    - bytes
    - chars
    - lines
    - words
  - testdata
- In the parameters directory I’ve put everything related to parameters, which means the definition of the parameters and some helpers.
- In the pipeline directory I’ve put everything about the way information is passed through the pipe operator and the standard input, like identifying when it happens and how to read from it.
- In the reader directory I have everything related to my readers; the main reader file contains the WcReaderManager interface, which has a Count function, and some helpers to initialize the readers.
- The testdata directory is just a place to store the sample file.
It required a bit of refactoring to get into this shape. Inside the reader directory there is the core of my application, beginning with the reader.go file: it contains the WcReaderManager interface and a bunch of functions which help me initialize and call my readers in specific ways.
Under each bytes
, chars
, lines
, and words
directory there is the actual implementation that will be executed when needed.
As you can see it helps me isolate things and makes it clear where each function belongs. Having calls like parameters.HasProvided(), pipeline.HasInput(), and reader.CountWithSpecificReader(...) really improves readability and understanding; I like this structure for this reason.
Of course the development process wasn’t this smooth; there were trials and errors here and there, as there are supposed to be.
Perfect is the enemy of good enough
I tried to face this coding challenge by reading one requirement after the other and doing a step-by-step evolution in my codebase.
It means that I’ve had to change and adapt my code to the new requirements.
Yeah, I faced this challenge as if it were a real scenario, and this is, I think, the best way to learn: on the job.
In the beginning, it seemed quite simple, I just needed to do some counting here and there, but then at every green test, I felt the urge to clean and refactor my codebase.
Friendly reminder, refactoring should be part of your definition of done.
With each iteration, my codebase evolved into something more clear and thanks to my testing strategy I could do it in no time.
Sure, I had to move from one design decision to another, but that’s normal in a codebase; consider that whenever you touch some code, sooner or later you will need to change it, and how coupled it is to other components is what will make the difference in the long run.
If you want to have a look at the code, you can find it on my Github Profile: https://github.com/dlion/gowc.
So what do you think about my solution?
Did I miss something? Can I improve it? Can it be more idiomatic?
Of course this solution can be definitely improved, and the exercise wasn’t about reading large files (given the example file).
In that case I would have implemented the reader in a different way in order to not have the entire file in memory and so on.
Implement your solution and let me know what you think about this challenge.
Happy Coding!
A few weeks ago, I was interviewed by PointerPodcast about my experience during my Take 3 and how I work in Tanzu Labs.
It was quite fun; even though at some points I felt there was much more to talk about, I just barely scratched the surface.
Yeah, sorry for English speakers, it’s in Italian 🇮🇹
You can find the episode on Spotify and Apple Podcast here:
I’d like to start with how I ended up working full-time on an open-source project, and to do so, I have to describe what a VMware Take3 initiative is.
VMware, the company I currently work for, gives everyone the fantastic opportunity to join another team within the company for a limited amount of time, usually 3 months (that’s why the 3 😉).
Every team that needs some help or that is open to accepting temporary new joiners can publish an opportunity in an internal platform. Sometimes they require specific skills, proficiency in a particular stack, or just people with the motivation to learn new things.
Everyone who has the approval can apply to those opportunities, and have a conversation with the manager who published it to verify whether that person might be a good fit or not.
As you can imagine, I went through this process early this July and applied for a Take3 to join a small open-source team called CNB - Cloud Native Buildpacks.
TLDR: Buildpacks turn your code into OCI-compliant containers. They examine your source code, build it, and create a container image with all the required dependencies to run your application. ⚡️
Using Dockerfiles can be exhausting: you have to make lots of decisions, like which base image and version to use, making sure that all your application dependencies are ok with it. After that, you need to bring in additional dependencies and runtimes, build your application, and finally optimize all these operations to get an optimized container.
Cloud Native Buildpacks on the other hand would take care of all these steps, at least for most of the common use cases.
Your container also needs to be maintained over time; with Dockerfiles you don’t have a real separation between the base image, the runtime, your dependencies, and your application, so updating the image requires rebuilding it every time.
Cloud Native Buildpacks create different layers that can be swapped with new versions like Legos. 🧱
If you want to know more about it, have a look at the Cloud Native Buildpacks website.
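To give a rough idea of what that looks like in practice (the image name below is just a placeholder), building an application with the pack CLI boils down to something like:
pack build my-app --builder <builder-of-your-choice>
instead of writing and maintaining a Dockerfile yourself.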
CNB stands for Cloud Native Buildpacks, it’s a small team composed of amazing people who are spending their daily time on repositories like Buildpacks/Pack and Buildpacks/Lifecycle.
Their main duty is to help the Buildpacks maintainers and the Buildpacks community, they take care of the issues, develop new features, attend numerous working group meetings, and help release new versions.
The Open Source world is HUGE, and a standard to follow doesn’t exist yet, so every experience can be unique. 👈🏻
The team’s favorite way of working is to be async as much as possible due to the different timezones involved and then take advantage of the overlapped hours to pair/sync if necessary.
We were using an async standup where we could raise blockers and/or keep the rest of the team up-to-date.
During the week we had different meetings:
Most of my mornings were free from meetings due to the different time zones, so I could focus on getting things done. 🚀
Getting things done meant continuing my tasks in progress, reviewing open issues, and reviewing PRs. Being in a different timezone also allowed me to interact efficiently with whoever was in my timezone.
The afternoon was when I had most of my meetings: I could catch up with the rest of the team and join the necessary meetings explained above, besides, of course, our pairing sessions.
The expectation that had been set for me was to deliver one feature by the end of the 3 months of the Take3.
My todo list was kind of similar:
Despite my long experience jumping from one stack to another, I’ve had just one occasion to push any written Golang code to prod so far, and it was a few years ago. Now, the challenge was to re-learn it and, to kind of push it to prod.
Kind of, because our “prod” was a “release” state, not a real prod env, since the software I was working on was mainly a CLI app.
Honestly, I liked Golang and I was able to use it quite effectively in no time. I liked to work with this language and I would like to work again with it in the future. 👀
Thanks to my years of experience jumping from one stack to another as a consultant Extreme Programmer, I could re-learn the basics of the Go language quite quickly and, I’ve got my first feature merged within one week.
I picked up one first good issue: https://github.com/buildpacks/pack/issues/1800 and created a PR that has been merged within 1 week: https://github.com/buildpacks/pack/pull/1810. 🥳
At that moment, I was officially a paid Open Source contributor. 🚀
I started contributing to the Open Source when I was young, I think I was 17 years old, this is my first contribution ever: https://github.com/toshidex/DefollowNotify/pull/1.
Then I’ve been a Hacktoberfest contributor: 2017, 2018, 2019, 2020.
And finally, after so many years, I was a paid open-source contributor. I was so happy that I had to share it on Linkedin too.
Months have passed now, and I’m back at Tanzu Labs, happy to have had the time to help this amazing team and learn about Buildpacks internals.
It was an amazing experience I’d try again in the future.
I led and facilitated a user journey mapping session with some of our users about a flatten feature. It was really interesting getting feedback from our users and shaping the functionality based on it, and of course I was happy to have put my facilitation skills at the service of the project.
I over-achieved the initial goal of delivering just one small feature, but I got passionate about the project and couldn’t limit myself to delivering just one small thing; I wanted to have an impact and deliver as much value as possible to our users.
I’d like to thank:
I’d like to start by saying that I’m not a big fan of collecting as many certifications as possible; I’m more affected by the greedy-learner pathology, which is what pushed me down this route.
This indeed is my first real certification.
I started to study AWS just for fun and out of a desire to become a better professional. I wasn’t really interested in getting any certification.
I started it for fun and it became even more fun over time, digging deeper into AWS services and use cases, so at the end of the path/course I felt that, after all the effort, I could challenge myself even more by getting the certification.
I LOVE GOING OUTSIDE MY COMFORT ZONE
Did I need it? No.
Do you need it? Probably not.
It was just a matter of challenging myself, nothing more 🤭 so my advice is to enjoy the journey and learn as much as possible but just for your own sake, it will give you the extra motivation you need to pass the exam at the first try. 🎯
I work at VMware Tanzu Labs right now, and we often jump from one project to another; it’s always fun and it gives you the possibility to work on different things and not get bored. 🤩
One thing that I noticed was that during my career I’ve always been facing AWS architectures at least once per year.
At least one client had their ecosystem on AWS.
Working for a Multi-Cloud company gives me the possibility to have a T-shaped skill set, focusing more on the how than the what; I mean, I worked on Azure and GCP as well, but to be honest the most fun ecosystems I worked on were on AWS. 🤭
So I felt that I needed, for my career and for my interest (📚) to fill out some gaps that I had on AWS to become a better engineer and a better professional. 💪🏻
Working at VMware also means I have lots of benefits 😌🙏 and one of them is the possibility to access https://acloudguru.com/ courses/resources/labs FOR FREE! ✨
A Cloud Guru is well known to be very expensive but to have one of the best playgrounds out there.
Yes, essentially you can just go to their playground section, open a new session of their sandbox, and log in.
From that moment, and for a number of hours, you will have the chance to play with an almost real AWS environment, bill-free. 👀
Of course, A Cloud Guru has its course: https://learn.acloud.guru/course/aws-certified-developer-associate
I took it and I can say that it covers more or less everything you need to know to pass the exam.
In particular, I want to call out the 4 mock exams, which have been super useful to get to know the exam environment and the questions’ style.
Another resource I found useful to reinforce my knowledge was https://tutorialsdojo.com/
It’s a website that contains a study path for each certification.
Here the one for the Developer Associate: https://tutorialsdojo.com/aws-certified-developer-associate-exam-guide-study-path-dva-c02/
In Tutorials Dojo you can also find mock exams which, from what I’ve heard, are very close to the real ones, but I haven’t purchased them so I don’t have a personal opinion or experience with them.
On either Tutorials Dojo or A Cloud Guru you can find a list of recommended AWS whitepapers that are worth reading.
I know, they are quite big but I think that reading them once is worth your time.
I also really liked reading about other experiences on /r/AWSCertifications/, it is full of nice advice and great people who can help you out.
There I discovered that there are other recommended resources that I didn’t follow, maybe it’s worth having a look at it. 👀
Every person has their unique approach so there aren’t good or wrong ways to study.
With a full-time job it’s always complicated to find the time and the energy to study; for this reason, I repeat that you should study only because you want to learn new things, improve yourself, and become a better professional.
The certification itself IMHO doesn’t add anything to your skill set.
I studied for 2 months more or less, dedicating myself to it almost every day for at least 25 minutes.
Something I would like to highlight here is that I have more than 10 years of experience as a Software Engineer so your experience can be different and you might need more time and resources.
Obsidian is my main tool for taking notes, creating blog posts, scheduling my day, keeping track of everything, and of course: studying.
I switched over to Obsidian from Notion and I will never get back to it.
My main way to study is to take notes about whatever I read/watch and then look at them later on, to memorise those concepts better and keep them fresh.
Spaced repetition works very well for me, even though I’m not very consistent.
Another amazing tool I’ve been using to memorise better is Excalidraw, an infinite canvas that I used to divide each topic with its information.
It has been very useful to visualise the information that I’ve got from the video course and blog posts.
Worth mentioning that Obsidian has an Excalidraw plugin, which means you can use all Excalidraw functionalities from your Obsidian instance, keeping your drawings locally.
For me having a visual representation works quite well and it helps to remember better.
As mentioned before, doing mock exams IS KEY to passing your exam; it helps you get familiar with the questions and the timing.
A Cloud Guru gives you 4 exams that you can practice with.
My strategy was:
My goal was to pass each mock exam with at least 80% of correct answers.
I’ve done it multiple times during the 2 months I spent studying.
Don’t focus too much on the specific questions but more on the topics they cover.
A Cloud Guru helps a lot with the hands-on part but it doesn’t mean you can’t open an AWS free-tier account and try by yourself to play with AWS services.
Right now the exam is composed of 65 multiple-choice questions, so at first it may seem not hands-on oriented, but that would be the wrong assumption.
Lots of questions are about specific technical details and particular service options that are easier to answer if you have had the chance to play with them, and having hands-on experience is always better considering the nature of this certification.
Coming to the end of the article, I’d like to say again that I think that the certification per se says nothing about your competencies and skill set.
Yeah, it shows other people that you can stick to a plan, go out of your comfort zone and that you are capable of learning new things.
I’ve been a software engineer for more than 10 years so far, and I can say that having a certification doesn’t mean you are a better professional than those who don’t have it.
Don’t be too hard on yourself if you don’t pass the exam, don’t complete the course, or drop it after a few months.
Keep trying, keep being motivated thinking about what you are learning more than what you can do with the certification.
Good Luck! 🍀
It’s been a fun and rewarding journey, enhancing my expertise in building cloud-native applications with AWS services! ☁️🔭
Probably the most well-known principle: I’m talking about the Least Privilege design principle.
design principle.
This principle revolves around the concept of giving just the required privileges to a specific user/application to operate correctly, which means with the fewest privileges possible.
Following this principle makes unintentional or improper uses of privilege less likely to occur.
A few points to remember 👇
This principle is often called the non-bypassability
principle. Essentially, it states that every access attempt coming from an external domain should be checked, and especially, don’t act on the data received before validating that the request came from a valid source.
By following this principle, we have thorough and consistent authorization checks at every access point in a software system, protecting data and enhancing the security of our application.
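As a small illustration of my own (not from the article), in a Go HTTP service complete mediation can take the shape of a wrapper that checks every single request before the handler acts on the data it carries:
func requireAuth(next http.Handler, validToken string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+validToken {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return // the request never reaches the handler without passing the check
		}
		next.ServeHTTP(w, r)
	})
}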
This is the simplicity
principle, also called KISS.
Security and over-engineering are always a dangerous duo. Having a security mechanism with lots of hidden features and intricate components can increase the chances of something going wrong.
The rule to follow this principle is to keep your security mechanism simple. Don’t try to reinvent the wheel or overcomplicate the solution – keep it simple.
A simple system is easier to review, maintain, and test, and harder to get wrong.
The Open Design principle is often underestimated, but it’s one of the most powerful.
Essentially, it states that an attacker shouldn’t be able to break into our system just because they know how it works. Relying on the ignorance of the attacker to protect our system is always a big mistake.
We should always act as if the security mechanism is publicly known and depend on the secrecy of a few easily changeable items like credentials.
The opposite of the Open Design principle is called Security Through Obscurity, and there have been multiple documented cases that prove it doesn’t work.
Moreover, having an open design makes extensive public scrutiny possible and gives confidence to any user who knows about the mechanism used that our software is secure.
The rule of this principle is that in situations where a decision or authorization cannot be explicitly determined, the system should default to the most secure option.
Don’t distribute software with an empty or default password; instead, force the user to set it up during the installation process.
How many times have you seen default passwords being used by unaware users? By applying this principle, developers can minimize the risks associated with incomplete or erroneous authorization decisions.
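As a tiny, purely illustrative sketch of the idea (the names here are mine, not from any real system): anything that isn’t an explicit allow should be treated as a deny.

// Fail-safe default: only an explicit "allow" grants access; anything missing or unclear is denied.
type Decision = "allow" | "deny";

function isAccessGranted(decision?: Decision): boolean {
  return decision === "allow";
}

console.log(isAccessGranted("allow"));   // true
console.log(isAccessGranted(undefined)); // false: the secure option wins by default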
This principle states that access to critical resources or sensitive operations should depend on more than one independent condition. In this way, even if an attacker manages to break one condition, they still need to break the others to compromise the system’s security.
It promotes the distribution of privileges and responsibilities to multiple independent entities, reducing the potential impact of compromised accounts.
This principle focuses on reducing the amount of shared resources or dependencies between different components of a system. In some cases, sharing can reduce costs, but it increases security risks.
Following this principle leads to having more modular and robust software. For example, we can establish separate database connections for different components or modules instead of using a single, shared one. This approach minimizes the chances of conflicts or bottlenecks and allows for better isolation and scalability.
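To make the database example a bit more concrete, here’s a rough sketch (the factory and module names are made up for illustration): each module owns its own connection instead of sharing a single global one.

// Hypothetical sketch: one connection per module instead of a single shared one.
interface Connection {
  query(sql: string): Promise<unknown>;
}

// A stand-in factory; in real code this would come from your database driver.
function createConnection(database: string): Connection {
  return { query: async (sql) => { console.log(`[${database}] ${sql}`); return []; } };
}

// Each module owns its own connection, so an issue in one doesn't ripple into the others.
const billingDb = createConnection("billing");
const reportingDb = createConnection("reporting");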
This principle is more user-centric. It states that the security mechanism’s user interface must be designed to be user-friendly and simple to use.
If something is hard to use, it is often insecure in practice because users will work around it to make their lives easier.
One easy example is password rules: if they are too convoluted, after a few attempts users will resort to the simplest password that passes all the checks just to move on, causing the opposite effect.
These security design principles are not mere theoretical concepts; they provide practical guidelines that can be applied during the software development lifecycle, which, for me, is GOLD.
Of course, these security design principles are guidelines, and there may be good reasons not to apply them in some cases. It is important to think about them wisely and consider the specific context of your software development project.
Remember, security is an ongoing process. We must remain vigilant, continuously assess our systems, and act accordingly.
As software engineers, we have the responsibility to build software that not only meets functional requirements but also prioritizes the protection of user privacy, data integrity, and system reliability.
The alias I created for my account is dlion@domenicoluciani.com, where dlion is my nickname and domenicoluciani.com is my custom domain.
If you try to search it on Mastodon, you will find my current account, which resides on mastodon.social.
I discovered that Mastodon uses ActivityPub to communicate between different actors and that those actors are found using WebFinger, a way to attach information to a specific email address or other online resources.
So I just needed to implement the WebFinger spec on my domain to have it working.
On your Mastodon instance, you have an endpoint called .well-known/webfinger, which accepts a query parameter that allows other Mastodon instances to get information about a particular account.
<your-mastodon-address>/.well-known/webfinger?resource=acct:<your-nick>@<your-mastodon-address>
For instance, in my case, doing a curl GET request to this URL:
https://mastodon.social/.well-known/webfinger?resource=acct:dlion@mastodon.social
I get the WebFinger response for my account:
{
"aliases" : [
"https://mastodon.social/@dlion",
"https://mastodon.social/users/dlion"
],
"links" : [
{
"href" : "https://mastodon.social/@dlion",
"rel" : "http://webfinger.net/rel/profile-page",
"type" : "text/html"
},
{
"href" : "https://mastodon.social/users/dlion",
"rel" : "self",
"type" : "application/activity+json"
},
{
"rel" : "http://ostatus.org/schema/1.0/subscribe",
"template" : "https://mastodon.social/authorize_interaction?uri={uri}"
}
],
"subject" : "acct:dlion@mastodon.social"
}
I just need to put this response on my server, under the same path and file name (.well-known/webfinger), and that’s it.
For my blog, I use GitHub for hosting; specifically, I use github-pages, which means using Jekyll, a static site generator.
To have your Mastodon alias on your custom domain using Jekyll, you need to:
1. Create a directory called .well-known.
2. Inside .well-known, create a new file called webfinger.
3. Inside the webfinger file, put the response you get when you curl your actual Mastodon instance’s WebFinger endpoint, as mentioned before.
4. In _config.yml, add include: ["/.well-known"] to include that directory in your rendering.
And it’s done. Just push and wait a few minutes. Your alias will be found as anything@your-custom-domain.whatever by Mastodon, redirecting everyone to your actual account.
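If you want to double-check that the alias resolves before sharing it around, here’s a minimal sketch of mine (assuming Node 18+ with the global fetch API; swap in your own domain and nickname):

// Query your custom domain's WebFinger endpoint and print the subject it returns.
const resource = "acct:dlion@domenicoluciani.com"; // replace with your alias
const url = `https://domenicoluciani.com/.well-known/webfinger?resource=${encodeURIComponent(resource)}`;

fetch(url)
  .then((res) => res.json())
  .then((body) => console.log(body.subject)) // should print the acct: of your real Mastodon account
  .catch((err) => console.error("WebFinger lookup failed", err));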
If you want to know more about WebFinger, you can have a look at the original website of the spec and at Mastodon’s documentation:
I found out about this method thanks to this article by Maarten Balliauw:
Do you use pair programming? How do you do it?
— Swizec Teller encouraging you to Be An Expert (@Swizec) February 22, 2021
It often feels like a more tiring way to move slower to me. Great for solving gnarly bugs together, but not for coding.
I read complaints that 8 hours of pair programming is a nightmare: it ruins people’s lives, drains all their mental energy, and leaves them brainless at the end of the day.
I mean, pairing IS tiring, and our job IS tiring, BUT let me tell you a thing, and I want to make it clear:
If you are pair programming for 8 hours straight, you are doing it wrong!
Pair programming is a powerful and brain energy-consuming activity; therefore, taking breaks IS KEY. But pairs are frequently so focused on the task that they forget to take them until they suddenly realize they’re exhausted.
Here’s one technique I use to remember to take breaks: Pomodoro.
Essentially it forces you to take a break every 25 minutes of straight pair programming. The break usually lasts 5 minutes, during which you shouldn’t do anything related to that specific task. Do something else, step away from screens, look out the window or refill your glass of water.
Of course, you can be flexible according to the needs of you and your pair, but you shouldn’t push too hard on that 25-minute limit. Taking breaks is part of this activity; you and your pair should do it frequently. It’s essential to have breaks and rest. Please do it.
Another false assertion is that pairing slows down the development process.
Why have two people on the same task when you can have two people on two different tasks and speed up the development?
It seems a fair statement on the surface, but let’s look deeper:
Doing code reviews is a common practice in many organizations. Once you are done with your story, another person (or sometimes more than one) reviews and ultimately accepts the code you wrote to be merged.
What are the pitfalls of this practice?
Working alone on a task is cool until someone else on the team needs to know what you have done and how, and you are not available to ask.
Handing over your knowledge of that task will require explaining the code you have written, the context and the choices you have made, and hoping they get it quickly.
These are all issues you can mitigate by working with a pair from the start.
This way, two people build the same knowledge base around a specific task. Frequent rotations between pairs help to spread this knowledge, increasing the collective ownership of the codebase.
Are you off tomorrow? No worries, the code will continue to be developed while you’re out because the team knows what you know about the problem you’re solving together.
Then don’t do it!
Pairing is a valuable activity on several levels but should be used when it makes sense. Otherwise, it’s just another practice you follow because someone said you should.
If you see a task (e.g., adjusting some documentation) that doesn’t require pairing, do it yourself. It is okay to have a solo moment, and it’s up to you to decide whether or not to pair with someone else.
True, remote pairing is more complicated than pairing physically. However, nowadays, we have some tools, practices and equipment to overcome these challenges as much as possible.
Pairing is not only about showing what you know but also showing what you don’t know.
It’s totally fine saying “I don’t know” during a pairing session. It’s fine to admit you have less experience or are just blocked.
Your pair has to support, unblock, and guide you.
Pairing has the colossal benefit that cultivating people is part of the activity itself. It helps juniors grow and gain confidence, and it allows seniors to learn from less experienced engineers (it happens frequently).
Pairing is about trust and being honest and open with your pair, so don’t be afraid. Learning is a beautiful journey, and pairing is the safest way to do it.
Having more experience than your pair shouldn’t be a problem; it’s an occasion for mentoring.
The goal is to develop a shared context of the problem and solution. As a more experienced person, adjust your speed so your pair can follow along and ask you questions as necessary.
The pairing session is also an excellent way to clarify whether something is understood; if it can be explained in simple words, then you know it. Think of pairing with a less experienced colleague as a perfect way to improve your speaking and teaching skills.
There are probably many other complaints we could consider, but the result will always be the same.
Pair programming brings tons of benefits to your team if it is done correctly.
My advice is to set expectations at the beginning of each session. Agree on what the pairing session will look like and clarify the style together to avoid misaligned assumptions resulting in behaviours that can degrade the experience for both of you.
Happy Pairing!
Here are some resources you may find useful to convince your boss that pair programming is the way:
I want to thank my colleague Judy for taking the time to read and review this article and for providing me with lots of interesting advice and corrections.
Disclaimer: I’m still learning the language, so any feedback is very welcome. I’m writing this article to describe my journey and what I’ve learned so far, so any suggestions and corrections will definitely be appreciated.
Working on legacy code often means using whatever is already there; in this case, the testing framework that was already installed, but that I had never used before, was Jest.
Luckily, most of these frameworks are quite similar, but having a look at the documentation definitely helps.
One of the problems we have when working on legacy code is that testing it is quite tricky, because the code wasn’t made to be tested in the first place.
So you find these gigantic classes full of large methods with lots of dependencies that are impossible to test: a big ball of mud.
A legacy codebase can be composed of thousands of classes, each made up of hundreds of methods and lines of code, so where should we start?
My first step into a legacy codebase is about identifying the core business logic, the critical one. I work with the client’s team to identify which part should never break and which part delivers value.
Once identified, I start retrofitting some unit tests, trying to cover the main use cases, ensuring that everything works as expected, and building my safety net.
Working without a safety net is risky: it will slow us down and increase the likelihood of breaking something.
Often you have lots of dependencies to take care of, and most of the time those dependencies are doing lots of weird stuff you don’t know anything about, so the safest way to test everything without worrying about how those dependencies behave is to mock them, brutally.
In the following sections you can find some snippets of code I found useful during my journey on that project; handling legacy code can be tricky, and having the right snippet at the right moment can be a lifesaver.
I found out that, compared with other languages, TypeScript allows you to mock an object easily thanks to its structural (duck) typing.
Let’s see an example:
const realDependency = {
  functionIneedToMock: () => { /* real implementation */ },
  anotherFunctionIwantToMock: () => { /* real implementation */ },
}
We can just create a new object that reflects the same property structure as the dependency we need to mock, and that’s it: the TypeScript compiler will identify that object as a compatible type automagically, allowing us to use it seamlessly. ✨
const mockedDependency = {
  functionIneedToMock: jest.fn(),
  anotherFunctionIwantToMock: jest.fn(),
}
As you can notice, I’ve replaced the implementation with jest.fn(), which allows us to mock that function using Jest’s functionality.
Let’s see a more concrete example:
it("Should test something", () => {
const mockedDependency = {
functionIneedToMock: jest.fn(),
anotherFunctionIwantToMock: jest.fn(),
};
const obj = new ClassUnderTest();
obj.methodUnderTest(mockedDependency);
expect(mockedDependency.functionIneedToMock)
.toHaveBeenCalledTimes(2);
})
In the previous example, we want to be sure our mocked function is called 2 times. Easy, right?
How about the implementation? Easy peasy!
const mockedDependency = {
  functionIneedToMock: jest.fn(() => "hello world"),
  anotherFunctionIwantToMock: jest.fn(),
};
jest.fn(implementation) is a shorthand for jest.fn().mockImplementation(implementation).
Everything looks nice when you can inject your dependencies, but in the TypeScript world you can also just import whatever you need and use it wherever you want, more or less, without passing parameters. How can we mock or spy on those dependencies? Let’s see an example of how to do it:
import { MessageService } from "../message-service";

const messageService = new MessageService();

export class Manager {
  methodWhichUseLotsOfDependencies() {
    // ...
    messageService.publishMessage("hello");
    // ...
  }
}
and then let’s try to spy on our dependency using Jest:
import { MessageService } from "../message-service";

const spiedPublishMessageService = jest.spyOn(
  MessageService.prototype,
  "publishMessage"
);

it("Should publish the message", () => {
  const manager = new Manager();
  manager.methodWhichUseLotsOfDependencies();

  expect(spiedPublishMessageService)
    .toHaveBeenCalledWith("hello");
});
Simple and clean: essentially, we are spying on our dependency, specifically on the publishMessage method, and then asserting that it receives hello as a parameter.
Of course, we can always program the mocked dependency’s behaviour, since the jest.spyOn method returns a Jest mock 🏋🏻♂️
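For instance, a minimal sketch building on the spy above (it assumes publishMessage doesn’t return anything we care about): we can replace the real behaviour and restore it afterwards.

// The spy is a Jest mock, so its behaviour can be programmed like any other mock.
spiedPublishMessageService.mockImplementation(() => {
  // no-op: skip the real publish during the test
});

// ...and the original method can be restored once we're done with it.
spiedPublishMessageService.mockRestore();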
Often in legacy codebases we may find multiple static methods, used improperly.
How can we handle them with Jest?
We have a Mapper
class with a static mapSomethingToSomethingElse
method that we want to mock, for instance:
export class Mapper {
static mapSomethingToSomethingElse() { ... }
...
}
our mock:
import { Mapper } from "../mappers/mapper";

jest.mock("../mappers/mapper", () => ({
  Mapper: {
    mapSomethingToSomethingElse: jest.fn()
      .mockImplementation(() => {
        return "dummyMock";
      }),
  },
}));
And then we can perform some assertions like:
expect(Mapper.mapSomethingToSomethingElse)
.toHaveBeenCalledTimes(1);
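As an alternative I sometimes reach for (a rough sketch, not what that codebase used), jest.spyOn also works on static methods, since they live directly on the class object:

import { Mapper } from "../mappers/mapper";

// Spy on the static method directly on the class and stub out its result.
const mapSpy = jest.spyOn(Mapper, "mapSomethingToSomethingElse")
  .mockReturnValue("dummyMock");

// later, in a test, after exercising the code under test:
// expect(mapSpy).toHaveBeenCalledTimes(1);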
Testing an AWS Lambda handler is very similar to what we have already done, for instance:
import { APIGatewayProxyEvent, APIGatewayProxyResult, Context } from "aws-lambda";

export async function ApiGatewayDoSomethingWithLambdaHandler(
  event: APIGatewayProxyEvent,
  context: Context
): Promise<APIGatewayProxyResult> { /* ... */ }
And our test will look something like this. Let’s say we want to spy on our AuthService, providing dummy dependencies:
const mockedAuthService = jest.spyOn(AuthService.prototype, "checkPermissions");

it("Should test the handler", async () => {
  const dummyProxyEvent: Partial<APIGatewayProxyEvent> = {
    headers: { Authorization: "dummyToken" },
    body: "dummyBody",
  };
  const dummyContext: Partial<Context> = {
    awsRequestId: "dummyAwsRequestId"
  };

  const response = await ApiGatewayDoSomethingWithLambdaHandler(
    dummyProxyEvent as APIGatewayProxyEvent,
    dummyContext as Context
  );

  expect(mockedAuthService)
    .toHaveBeenCalledTimes(1);
})
As you can see, we used the Partial type to avoid mocking every property of those objects and then passed them to our handler, asserting that one of our mocks has been called correctly.
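One small convenience I sometimes add on top of this (a hypothetical helper of mine, not something from Jest or the AWS typings) is a tiny function that hides the cast in a single place:

import { APIGatewayProxyEvent } from "aws-lambda";

// Hypothetical helper: build a fixture from a Partial and do the cast in one place.
function fixtureOf<T>(overrides: Partial<T>): T {
  return overrides as T;
}

const dummyProxyEvent = fixtureOf<APIGatewayProxyEvent>({
  headers: { Authorization: "dummyToken" },
  body: "dummyBody",
});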
I still need to learn more than a bunch of things about this very powerful language; I’ve just scratched the tip of the iceberg, and I’m looking forward to continuing to explore this world called TypeScript.
As an Extreme Programmer, I like to explore new things, try new stuff, solve problems, and overcome challenges. Jumping from one project to another keeps pushing me outside my comfort zone, trying to learn at least something new every day.