coding

Using Commercial Models

This is the third part of my series on running AI models locally with Simon Willison’s LLM tool. The other two parts are Run Your Own AI and Using AI In Your Own Apps.

As I mentioned in the previous post, you can only go so far with models running locally. When you outgrow these models, you’ll probably want to step up to a paid model such as something from OpenAI. This is easy to set up.

OpenAI Setup

First, go to openai.com and log in to the API Platform. If this is your first login, you’ll need to set up an organization (I used my name for this) and a project (mine is called Default project).

Next, go to the Billing section to add a Credit Balance. You can start off small here. I added $10.00. Check the Pricing page in the OpenAI docs to see costs for each model. The prices are per million tokens, so even $10.00 should last a while.
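To get a feel for per-million-token pricing, here’s a rough sketch of the arithmetic. The two prices and the function name estimate_cost are placeholders I made up for illustration, not real OpenAI values. Swap in the numbers from the Pricing page for the model you plan to use.

```python
# Placeholder prices in dollars per million tokens.
# These are assumptions for illustration; check the Pricing page for real values.
INPUT_PRICE_PER_MILLION = 0.15
OUTPUT_PRICE_PER_MILLION = 0.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A request with 1,000 input tokens and 500 output tokens.
    cost = estimate_cost(1_000, 500)
    print(f"Estimated cost per request: ${cost:.6f}")
    print(f"Requests covered by $10.00: {int(10.00 / cost):,}")
```

Input and output tokens are usually priced differently, which is why the sketch keeps them separate.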

Finally, go to API Keys and create a new key. You’ll want to save this in a secure location. I added mine to 1Password. To use the API key, you’ll need to make it available to your Python program.

Environment Variables

The easiest way to use your API key would be to enter it directly into the code, but this is a bad idea if you host your code somewhere like GitHub. Anyone could take your key and use up all your credits.

The better (and default) option is storing your API key in the environment variable OPENAI_API_KEY. To set this variable, add export OPENAI_API_KEY=<your key> to your ~/.zshrc file and restart your terminal. You might also want to look into python-dotenv for loading environment variables from a .env file.
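If you want to confirm that the variable is visible to Python before spending any credits, a small check like this can help. It’s only a sketch: get_api_key is a helper name I invented, and neither library requires it.

```python
import os


def get_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set. Add it to ~/.zshrc or a .env file.")
    return key


if __name__ == "__main__":
    try:
        key = get_api_key()
        # Print only the length so the key never appears in the terminal.
        print(f"Found a key with {len(key)} characters.")
    except RuntimeError as error:
        print(error)
```

Failing early with a readable message is easier to debug than an authentication error from deep inside a library.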

If you’re using the LLM command line app, set the key with this command:

llm keys set openai

LLM will also use the environment variable if it is set.

Using LLM

Now we’re finally ready to write some code. Let’s start by using LLM. The Python API documentation covers all of the features, but a simple example is enough to get started.

Initialize a new project with uv and add the llm library:

uv init llm-openai
cd llm-openai
uv add llm

Note that we didn’t add llm-mlx this time. We’re using a model run by OpenAI, not a local model. Here’s a simple program to generate a haiku. Add this to main.py:

import llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt("Write an inspiring haiku about AI.")
print(response.text())

Now run this with uv run main.py.

Using the OpenAI Python Library

You can also use the official OpenAI Python library if you like. The steps are basically the same.

Initialize a new project and add the openai library:

uv init openai-api
cd openai-api
uv add openai

The program is very similar. Add this to main.py:

import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)
response = client.responses.create(
    model="gpt-4o-mini",
    input="Write an inspiring haiku about AI.",
)
print(response.output_text)

Loading the API key from OPENAI_API_KEY is the default. You can leave that out if you want. Run this with uv as before:

uv run main.py

You’re now well equipped to write code using both locally run models and the commercial models from OpenAI. I hope you learned something from this series. I can’t wait to see what you build.

LLM Workflows

There are as many ways to use LLMs to write code as there are engineers experimenting with them. I love reading about LLM workflows, picking up tips, and trying to streamline my own setup.

I’ve already linked to Simon Willison’s Here’s how I use LLMs to help me write code, but I’ll start with it again. Simon describes using LLMs to write code as difficult and unintuitive, which matches my experience, especially when I was just getting started. His post then includes a bullet point list of tips.

An older post that I just came across is How I use LLMs by Karan Sharma. This one focuses on using Aider to write and run code. As someone who used to spend almost all of my time in a terminal, I can appreciate this workflow. Aider is very powerful. If I were still spending all my dev time in tmux and vim, I could definitely see myself using something like this.

While you’re at Karan’s blog, check out Cleaning up Notes with LLM. He wrote a Python script using LLM to clean up Obsidian files. Not only is this a great example of LLM use, it’s also something that I, as an Obsidian user, could run on my own files to generate some better tags and categories.

Next up, Harper Reed shares My LLM codegen workflow atm. I appreciate the fact that he separated greenfield from legacy code. Many people focus on using LLMs for greenfield, but as an engineer at a large corporation, most of my time is spent working on existing code.

Harper’s workflow for greenfield development is really interesting. He uses one LLM for planning. This generates a prompt_plan.md and a todo.md checklist. He then feeds those files to another LLM, either directly to something like Claude or through Aider. For non-greenfield work, he uses a set of mise tasks to gather context and feed it to the LLM.

Harper followed up the original workflow post with Basic Claude Code. The workflow is similar to the previous post, but it uses Claude Code instead of Aider. Claude Code opens the generated files and handles everything that needs to be done.

As for me, I’ve been trying to follow An LLM Codegen Hero’s Journey. I’m currently using GPT-4 through Copilot in Visual Studio Code. Having an employer provide free access to all the tools is really nice. For most of my work I’m using autocomplete, but I’m trying to let Copilot do more. I guess that puts me somewhere around step 3.

I’ve spent a good chunk of time working on instruction files for VS Code. This improves Copilot’s ability to make changes to our existing code. An interesting trick is to have Copilot enhance your instruction file using what it knows about your application. I was able to catch and correct some misunderstandings, which made Copilot that much better.

If you have an interesting workflow or a link to someone else’s workflow, please share it with me on Bluesky. We’re all still learning and sharing is caring.

Using AI In Your Own Apps

In my previous post I covered how to run a large language model on your own computer. I covered how to send prompts to it and start a chat. The next step is adding an LLM to your own applications. The llm library makes this easy.

First, initialize a new project with uv and add the llm and llm-mlx libraries:

uv init llm-test --python 3.12
cd llm-test
uv add llm llm-mlx

Now open up main.py with your favorite editor and make it look something like this:

import llm

def main():
    model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
    response = model.prompt(
        "Tell me a joke."
    )
    print(response.text())

if __name__ == "__main__":
    main()

There are three steps to the process:

Get the model we installed in the previous post.
Get a response by prompting the model.
Print the response text.

You can now run your new AI program with uv run main.py to see the result.

Once this is working, see the LLM Python API documentation for more information. For example, maybe you’d like to add a system prompt. Here’s a classic:

import llm

def main():
    model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
    response = model.prompt(
        "Tell me a joke.",
        system="Talk like a pirate."
    )
    print(response.text())

if __name__ == "__main__":
    main()

In the documentation you’ll also find support for conversations that allow you to have an ongoing chat. Here’s a simple example of a CLI chat bot:

import llm

def main():
    model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
    conversation = model.conversation()
    
    print("Welcome to my LLM chatbot! Type 'exit' to quit.")

    while True:
        question = input("What can I help you with? ")

        if question == "exit":
            break

        response = conversation.prompt(question)
        print(response.text())

if __name__ == "__main__":
    main()

Note that I added a new conversation with model.conversation(). Then instead of prompting the model directly, I prompt the conversation. This allows the model to remember previous questions so you can ask follow-up questions.
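To make the idea of a conversation more concrete, here’s a toy sketch of one way the remembering could work: keep a list of earlier turns and send them along with each new question. This is my own illustration, not how the llm library is implemented. SimpleConversation and fake_model are made-up names, and fake_model is a stand-in so the example runs without a real model.

```python
from typing import Callable


class SimpleConversation:
    """Toy conversation that replays earlier turns with each new prompt."""

    def __init__(self, ask: Callable[[str], str]):
        self.ask = ask  # any function that turns a prompt into a reply
        self.history: list[tuple[str, str]] = []

    def prompt(self, question: str) -> str:
        # Prepend every earlier question and answer so the model sees the context.
        lines = []
        for earlier_question, earlier_answer in self.history:
            lines.append(f"User: {earlier_question}")
            lines.append(f"Assistant: {earlier_answer}")
        lines.append(f"User: {question}")
        answer = self.ask("\n".join(lines))
        self.history.append((question, answer))
        return answer


def fake_model(prompt: str) -> str:
    """Stand-in for a real model: reports how many lines of context it received."""
    return f"I received {len(prompt.splitlines())} line(s) of context."


if __name__ == "__main__":
    conversation = SimpleConversation(fake_model)
    print(conversation.prompt("Tell me a joke."))
    print(conversation.prompt("Explain it."))
```

The second call sees three lines of context instead of one, which is the whole trick: the model itself has no memory, so the earlier turns have to be sent again with every request.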

From here the possibilities are basically endless. You could use your favorite web framework to create your own web-based chat bot or use fragments to analyze external content.

Next time I’ll cover using a commercial model. Until our personal computers get much faster we’ll only be able to go so far with local models.

Run Your Own AI

Large Language Models seem to be taking over the world. People in all kinds of careers are using LLMs for their jobs. I’ve been experimenting at work with using an LLM to write code with mixed, but promising, results.

After playing around with LLMs at work, I thought it might be interesting to run one locally on my laptop. Thanks to the hard work of several open-source developers, this is pretty easy.

Here are some instructions that I shared with my co-workers. These are specifically for Macs with an M-series processor. On a PC, skip the steps about MLX and use Ollama to download a model. Then install the llm-ollama plugin instead of llm-mlx.

Install uv

If you’re a Python developer, you might already have uv. If not, it’s easy to install:

curl -LsSf https://astral.sh/uv/install.sh | sh

If you’re not familiar with uv, the online documentation describes it as “An extremely fast Python package and project manager, written in Rust.”

Install llm

Now that we have uv installed, you can use it to install Simon Willison’s llm:

uv tool install llm --python 3.12

Note that I specified Python version 3.12. One of the dependencies for the llm-mlx plugin doesn’t support 3.13 yet.

Install llm-mlx

MLX is an open-source framework for efficient machine learning research on Apple silicon. Basically, that means MLX-optimized models run much faster on a Mac.

llm install llm-mlx

Again, if you’re not on a Mac, skip this step.

Download A Model

The llm CLI makes this easy:

llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit

This installs a model from the mlx-community, optimized for a Mac.

Run The Model

It’s finally time to test everything out:

llm -m mlx-community/Llama-3.2-3B-Instruct-4bit "Tell me a joke"

If you followed these steps, you should see a joke. AI jokes are kind of like dad jokes. People seem to groan more than laugh.

Chat With The Model

Rather than make one request at a time, you can also chat with local models:

llm chat -m mlx-community/Llama-3.2-3B-Instruct-4bit

This is how most of us interact with LLMs online. The chat option responds to whatever you type at the prompt until you enter “exit” or “quit”.

What’s Next?

Maybe try a different model and compare the results. The mlx-community on Hugging Face has lots of options. Beware, some of these are very large. In addition to a large download, they also require a lot of memory to run locally. For another small model, you might want to try Qwen3:

llm mlx download-model mlx-community/Qwen3-4B-4bit
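To see why the larger models are so demanding, here’s a back-of-the-envelope estimate of the memory needed just to hold a model’s weights. It’s a rough sketch under simple assumptions: the function name is my own, it counts only parameters times bits per weight, and it ignores quantization overhead and the extra memory used while the model is running.

```python
def estimate_weight_memory_gb(parameters_in_billions: float, bits_per_weight: int) -> float:
    """Rough size of the model weights alone, in gigabytes (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return parameters_in_billions * bytes_per_weight


if __name__ == "__main__":
    for name, billions, bits in [
        ("3B model, 4-bit", 3, 4),
        ("4B model, 4-bit", 4, 4),
        ("70B model, 4-bit", 70, 4),
    ]:
        print(f"{name}: about {estimate_weight_memory_gb(billions, bits):.1f} GB")
```

By this estimate the small 4-bit models here fit comfortably on most laptops, while a 70B model at the same precision needs tens of gigabytes before it does any work.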

Check out Simon Willison’s blog. He shares tons of interesting info covering the world of AI. I have a bad habit of leaving his posts open in tabs. Here are a few I have open right now:

Building search-based RAG using Claude, Datasette and Val Town – an in-depth look at how he uses LLMs to build tools.
Run LLMs on macOS using llm-mlx and Apple’s MLX framework – the basis for this blog post.
Here’s how I use LLMs to help me write code – it’s always interesting to see someone else’s process. Lots of useful tips here.
Not all AI-assisted programming is vibe coding (but vibe coding rocks) – To some folks vibe coding is any time you use an AI to assist, but that’s really not what it means.

Finally, a really interesting thing to me is embedding an LLM in an application. I’ll cover that in another post.

Crafting Interpreters

I’ve been working my way through Crafting Interpreters by Robert Nystrom for the past few days. It’s an excellent book about creating your own programming language.

I was already familiar with Nystrom’s previous book Game Programming Patterns. A great book on design patterns, even if you’re not a game programmer.

Crafting Interpreters covers a new object-oriented scripting language called Lox. You’ll first implement a tree-walk interpreter in Java and then create a bytecode virtual machine in C.

I probably won’t ever directly use these skills in my day job, but gaining a deeper understanding of interpreters and virtual machines can’t hurt. Especially considering I split my time between the Ruby interpreter and Java VM.

Almost as interesting as the book itself is the way he wrote it using a combination of Markdown for prose and hand-drawn diagrams. More details are in his blog post Crafting “Crafting Interpreters”.

If the book sounds interesting to you, you can read it online for free at the link above. After reading the first few chapters onscreen, I bought a print copy. I like having the book open on my desk while writing code.

Code Kata

Over the course of interviewing for a coding job, you’ll be required to prove that you actually know how to write code. This is usually by solving a problem of some kind. This might be a take-home challenge, a pair-programming exercise, or even (shudder) coding on a whiteboard. Obviously, you’ll want to practice these kinds of problems.

Many of the problems used in interviews can be found online in the form of a code kata. The name code kata comes from the kata in martial arts. In martial arts this involves practicing a set of moves repeatedly until they become like a reflex. Here’s the definition of a code kata on Wikipedia:

an exercise in programming which helps programmers hone their skills through practice and repetition.

I’m not really happy with that definition. To me, a code kata is different in that you shouldn’t repeat the same problem. Instead, solve a variety of challenges to build your problem-solving and coding skills. Of course, if you aren’t happy with a solution, refactor it or even start over, but I feel like the biggest gain comes from solving many different problems.

So, where do you find these problems? Here are a few websites I’ve used in the past:

Project Euler has hundreds of challenging mathematical and computer programming problems. If you work your way through this list, you’ll be ready for anything. Many of these will be too difficult for a beginning programmer.

CodeKata is a site by PragDave (aka Dave Thomas of the Pragmatic Programmers). According to Wikipedia, Dave Thomas was probably the first person to use the term code kata. This site has a collection of 21 challenges.

Exercism has a collection of practice problems in over 30 different programming languages. If you only visit one site on this list, start here. After you submit your solution, you can see other users’ solutions and give and receive feedback.

Refactoring

The sites above involve solving a problem of some kind, but The Gilded Rose kata is all about refactoring. You’re given a program and asked to add a feature. Unfortunately, the program is poorly written and has no tests. Rather than just add the new feature, you’re expected to also clean up the program.

After you try the kata on your own, check out the talk All the Little Things by Sandi Metz at RailsConf 2014. Sandi Metz is the author of the highly recommended book Practical Object-Oriented Design in Ruby and a great speaker. If you’re a Ruby developer, you should watch all of her talks (assuming you haven’t already seen them).

What is your experience with practice problems? Did I miss a great code kata resource? Do you think this is all a waste of time? I’d love to hear your thoughts about coding challenges such as these.

Your Resume

Now that your LinkedIn profile is looking good, use the information it contains to write your resume. A resume is not strictly required since most companies are happy with just a LinkedIn profile, but it never hurts to have one.

There are many fancy resume templates available online, but I recommend something clean and simple. I follow a format similar to the way my LinkedIn profile is set up.

Header

First, put your name at the top in large text. I used Microsoft Word’s Title style, a 28-point font. Under your name, put your LinkedIn address and contact information. Include at least your email address and phone number. Finally, finish off the first section with your summary statement. Here’s mine:

Experienced Engineering Manager and Software Engineer with a BS in Computer Science. Author of Rails Crash Course. Speaker, community organizer, teacher, mentor, blogger.

Experience

Add a heading, then list your experience. Put your job title in bold, followed by the company name and dates of employment. For your most recent roles, give 3-5 bullet points with the things you do as part of your job. Also include a list of technologies used in case someone is just scanning for keywords.

If you don’t have a lot of professional experience, include volunteer work or personal / freelance projects here. This is especially important if you’re seeking your first job as a software engineer. If you’re looking for a coding job, but your only experience is driving for Uber, then your resume isn’t really helping.

If you have a lot of experience, list fewer and fewer details for older jobs. For past positions maybe only list technologies used. For jobs that aren’t relevant to the position you’re currently seeking, only list the title, company, and dates. You might even leave these off assuming it wouldn’t look like a gap in employment.

Education

Finish up your resume with information about your education. List your college degree if you have one. List the bootcamp you attended and/or any online classes you’ve completed.

If you haven’t done any of these things, just leave this section off. I would avoid doing something clever like “graduate of the school of life.” Some hiring managers might think that’s cute, but others will probably discard your resume.

Done

I make no mention of skills other than as part of my employment history. I don’t see the need for a big block of skills. Anyone can type a list of programming languages. Show me what you really know by listing some experience.

I also don’t mention references. Everyone knows you’ll provide references on request, so why bother taking up space on your resume? Many companies don’t even ask for references anymore.

My resume is just under two pages long. Many people will tell you that your resume should fit on a single page. I don’t see how that’s possible for someone with a few different jobs and some education. List everything you need and don’t worry about the length.

Finally, export your resume as a PDF. I don’t usually print my resume unless I’m going for an in-person interview. In that case, I print a few copies and stash them in my bag in case someone asks for it. So far no one has ever asked for a hard copy, but better safe than sorry.

Your LinkedIn Profile

The first thing I look at when someone applies for an engineering job is their LinkedIn profile. Even before I look at their resume. To me, there are two big benefits to going straight to LinkedIn:

I can easily see how we’re connected. Do we know people in common? Maybe you used to work somewhere I used to work. Maybe you met someone I know in a user group. This connection could tell me what you’re really capable of, even if you aren’t good at selling yourself.
It’s in a standard format. You can customize your profile page a little, but for the most part they all look the same. I know where to find your experience and education without thinking. It’s easy to quickly scan the page and get a sense of where you’ve been.

Intro

The first section is called the intro. As I said in my post about your GitHub profile, put a current, recognizable picture of yourself here. Next, add a headline consisting of your current job title and employer.

If you aren’t currently working, or aren’t working in technology, I would be a little more creative here. Are you doing any freelance work, writing, or teaching? If nothing else, write something like “Aspiring Software Engineer.”

Finally, add your summary. This is basically your elevator pitch. Think of one or two sentences that quickly tell an employer what you can do.

Experience & Education

The experience section can be a challenge for someone seeking their first coding job. How can you list experience when you don’t have any? Obviously, you can’t. If you’re currently working outside of the technology world, you should list your job. Even if it’s barista at Starbucks.

Once you land your first technical job, remove your earlier experience unless you can make a compelling case for how it might make you more attractive to a future employer. For example, if you had management experience in a previous, non-technical job, you might want to keep that on your profile.

Include a few sentences or bullet points describing what you accomplished at each of your previous jobs. If you aren’t sure what to write or how to word your accomplishments, try looking at job postings for similar jobs. Recruiters work hard getting the wording just right on their postings. Reuse their work.

You can also add volunteer experience in this section. Many organizations need help building or updating their web site. You can also volunteer to teach coding or other skills in your free time. Code.org is always looking for programmers to volunteer to teach coding to kids in the classroom.

The education section should be straightforward. If you have a college degree, list it here along with your major field of study. Similarly, if you attended a coding bootcamp, list it here. You can also include relevant online courses such as those at Udemy, Coursera, or freeCodeCamp.

Skills & Endorsements

List skills such as programming languages and frameworks, industry knowledge, and interpersonal skills in this section. If you have demonstrated these skills to other people, ask them to endorse you. An endorsement from someone tells me that you truly have the listed skill.

Recommendations

Personal recommendations can be hard to get. Give recommendations and hope that the recipient will recommend you in turn. I would also directly ask for recommendations from managers and coworkers. Just be aware that asking for a recommendation on LinkedIn can be interpreted as a sign that you’re looking for a new job.

Accomplishments

Use the accomplishments section to highlight things like publications you’ve written, awards you’ve earned, or special projects you’ve completed. This can be a great way to fill out your profile if you don’t have a lot of experience yet. Also, include any foreign languages you speak here.

Go Web App Walkthrough

As a follow up to my previous post, I thought I would walk through the basic web app I built using Go, Gin, and GORM. I realize this might not be the most idiomatic Go program, but I was going for simplicity and brevity.

You can find this code in the file main.go in my goweb GitHub repo.

The Code

Every Go application starts with a package definition. Since this program is an executable and not a library, call this package main. This tells the compiler that the main function, where execution starts, is found here.

package main

Next, import libraries needed by the program. In this case gin and gorm, as mentioned earlier, as well as the GORM postgres driver. The program needs godotenv and os for loading environment variables and finally net/http which defines HTTP status codes.

import (
    "github.com/gin-gonic/gin"
    "github.com/jinzhu/gorm"
    _ "github.com/jinzhu/gorm/dialects/postgres"
    "github.com/joho/godotenv"
    "net/http"
    "os"
)

Next, define the User model. This is a standard Go struct that includes gorm.Model. This adds ID and time stamp fields.

type User struct {
    gorm.Model
    Name string
}

The global variable db is used throughout the program to send queries to the database.

var db *gorm.DB

The usersIndex function is a request handler. It loads all of the users from the database into a slice and uses the Gin context to render them as HTML using the index.html template.

func usersIndex(c *gin.Context) {
    var users []User

    db.Find(&users)

    c.HTML(http.StatusOK, "index.html", gin.H{
        "users": users,
    })
}

Finally, the main function, where execution begins.

func main() {

First, load the contents of the .env file. If there’s an error, panic. That is, end the program with an error message.

    err := godotenv.Load()
    if err != nil {
        panic("Error loading .env file")
    }

Now that the environment is loaded, use the DATABASE_URL variable to connect to the database. Again, panic if there is an error. The defer statement calls db.Close() when this function ends to close the database connection.

    db, err = gorm.Open("postgres", os.Getenv("DATABASE_URL"))
    if err != nil {
        panic(err)
    }
    defer db.Close()

Use the connection to auto migrate the database. This creates the users table, based on the plural version of the model name, with columns named id, created_at, updated_at, deleted_at, and name.

    db.AutoMigrate(&User{})

Next, count the number of users in the database. If there are zero users, create a few examples. You might think of this as seed data.

    count := 0
    db.Table("users").Count(&count)

    if count == 0 {
        db.Create(&User{Name: "Alice"})
        db.Create(&User{Name: "Bob"})
        db.Create(&User{Name: "Carol"})
    }

Create a Gin router using gin.Default().

    router := gin.Default()

Tell the router where to find static files, such as CSS, JavaScript, and images. Also, load the HTML files in the templates directory.

    router.Static("/assets", "./assets")
    router.LoadHTMLGlob("templates/*")

Next, tell the router to handle GET requests to the root path by calling the usersIndex function defined earlier.

    router.GET("/", usersIndex)

Finally, call router.Run() to start handling requests in a loop.

    router.Run()
}

And with that you have a complete program that uses data from a database to render HTML templates in response to web requests.

From Rails to Go

A recent project at work required a high performance API and a light-weight front end. I’ve been looking for an opportunity to explore other languages and frameworks and this seemed like a perfect opportunity. We already have a few services written in Go, so I decided to start there.

The Go Programming Language is a “fast, statically typed, compiled language.” In other words, very different from Ruby (which is dynamically typed and interpreted). In Ruby, everything is an object; Go doesn’t have classes. Yes, very different.

Web

With my choice of programming language set, it was time to find a web framework. After a bit of Googling, I came upon Gin. It promises “performance and productivity” which is exactly what I needed.

Gin features a very fast router for processing requests as well as easy rendering of responses in JSON or HTML. Gin uses the html/template package for rendering dynamic HTML pages (also see text/template for complete documentation).

Database

The next step was database access. I typically work with Active Record, so I started searching for something similar. GORM is an ORM Library in Go. It is full-featured (almost) and developer friendly. Its API is similar to Active Record and it supports all of the features I need including a PostgreSQL driver.

Creating models is easy using GORM. Define the model as a Go struct and include gorm.Model to add ID and timestamp columns. Then pass the model to db.AutoMigrate and GORM will create a table with the required columns automagically. The migration doc has examples of automatic and manual migrations.

Miscellaneous

Since I plan on eventually deploying this application to Heroku, and Heroku uses an environment variable for the DATABASE_URL, I’m also using the Go port of Ruby’s dotenv project named godotenv to load environment variables in development.

Sample App

I posted a complete sample application named goweb on GitHub. Feel free to clone that repo as a starting point for your own creations. The README file includes instructions for setting everything up on a Mac with Homebrew.

If you aren’t on a Mac, you should be able to follow the Go installation instructions and then download PostgreSQL to get the sample app running. Hopefully this will help others looking to explore web development with Go.