Developing a Personal Project, Part 2: Research

12 May 2023

In the first part of this series, I explained my reasoning around the broad choices that you can make at the start of a personal project. In this second part, I will discuss the logical next step: research.

At the end of the last blog post, I described how I resolved to make a Discord chat bot using Elixir. With this concrete goal firmly in mind, I can now research how best to use these tools. This can be a very open-ended process, ranging in complexity from querying a ChatGPT session, to blindly Googling questions until a vague idea forms in my mind, to performing comprehensive analyses of technical documentation. I recommend a slow and thorough approach at this stage. Misreading documentation early on can have you spinning your wheels for hours or even days when it comes time to implement basic features. Coming across a new framework may tempt you to alter your set objective, taking you nearly back to square one. Much as with the objective-setting process, the technical research process is an opportunity to limit the scope of your project and speed you on your way to success. In short, take your time.

Given my objective, there were two natural starting points for research: the Discord API and the Elixir application. Seeing as I’m already familiar with building Elixir apps, I started by reading the Discord documentation. Discord has built a powerful web API for interacting with their platform. They give developers access to a wide range of resources, some of which are very exciting to consider, like playing sounds via the voice channels. But I’m just here to make a digital assistant, so I’m interested in two things: creating a persistent connection between an Elixir application and Discord, and passing data back and forth over that connection.

Reading through the Getting Started guide, I learn that there are three things I need before I can break ground on any code. Firstly, I need a Discord account, which conveniently I already have. Secondly, I need to register a new application on Discord’s developer portal. Thirdly, I need to register a new bot user. This is a key distinction that may help avoid some confusion down the line: your application, your code, runs as any application would, either locally on your machine or on a hosting service (e.g. fly.io). Meanwhile, the “bot” is actually just a “user” that is added as a member to a given Discord server/guild. The bot can be given any permissions that a user can have, and it acts as the visible presence of your application within the server. Any number of servers can have a bot user connecting to your application (though over a certain threshold of servers Discord imposes a review process).

Once I have my bot user ready, I encounter my first technical decision: WebSockets vs. REST. Discord actually has two APIs, one called the Gateway API which is WebSockets-based, and another which uses a classic REST-like pattern. There are plenty of strengths and weaknesses to consider between these two approaches at scale. HTTP endpoints are more widely adopted and supported, but can have trouble with concurrent requests. WebSocket connections provide truly realtime communication, but they’re not supported by older browsers and you must implement a protocol for recovering terminated connections yourself.

These considerations prompt a follow-up question: do I want to implement the logic for the network requests myself, or do I want to use an existing library? To help make this decision, I revisit my goal of deploying this app. Writing my own WebSockets connection would be a good exercise, but with experience comes humility, and I’m quite certain that my boilerplate WebSockets code will not be as reliable as that of an open source community. A compromise, for learning’s sake, would be to use an existing library and then read and understand its source code.

At the end of the Getting Started guide, I found a link to a Community Resources page: in other words, links to wrapper libraries. A wrapper library provides methods for interacting with an API in the language of your choice. Any API of notable popularity will have a host of these. Most Discord libraries seem to be for JavaScript, but a quick Google search finds an Elixir one for me: Nostrum. The documentation seems robust enough, and there’s an active community around the project. A quick read of the docs shows me that Nostrum implements a WebSockets connection to the Gateway API, and just like that my decision is made for me. I can make decent headway while learning from an open source project.
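To make the Gateway model a little more concrete, here is a minimal, dependency-free sketch of the kind of event handling a bot does. It assumes events arrive as `{event_name, payload, ws_state}` tuples, which is the shape Nostrum passes to a consumer's `handle_event`. The module name `AssistantSketch`, the `!ping` command, the `author_is_bot` field, and the `{:reply, channel_id, text}` return value are all hypothetical stand-ins for illustration, not part of Nostrum or the Discord API.

```elixir
defmodule AssistantSketch do
  @moduledoc """
  Hypothetical sketch of Gateway-style event handling.
  Payloads are plain maps standing in for real message structs.
  """

  # Ignore messages written by bots (including ourselves) to avoid reply loops.
  def handle_event({:MESSAGE_CREATE, %{author_is_bot: true}, _ws_state}), do: :ignore

  # Respond to a hypothetical "!ping" command. Instead of calling Discord,
  # return a description of the reply so the sketch runs without a token.
  def handle_event({:MESSAGE_CREATE, %{content: "!ping", channel_id: channel_id}, _ws_state}) do
    {:reply, channel_id, "pong"}
  end

  # Every other event type (and every other message) is ignored.
  def handle_event(_other), do: :ignore
end

# A made-up event, shaped like the tuples described above.
event = {:MESSAGE_CREATE, %{content: "!ping", channel_id: 123, author_is_bot: false}, nil}
IO.inspect(AssistantSketch.handle_event(event))
# prints {:reply, 123, "pong"}
```

In a real Nostrum application, the module would `use Nostrum.Consumer` and send the reply through Nostrum's API module rather than returning a tuple. The exact function names can differ between releases, so it is worth checking the documentation for the version you install.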

In the next post I’ll talk about my system design and how my research distills into a planning document.