Tech Talk
Mike joins The Dark Side
By Michael E. Duffy

I’d like to tell you about my recent trip to The Dark Side. Yes, folks, I’ve finally broken down and actually started to use ChatGPT to write software.
My employer, Electronic Arts, allows us to use it and provides free access, but I’ve been hesitant to use it on production code. Instead, I’m using it on some other projects, one of which is a little project to help me organize my email. All the mail I receive at mike@mikeduffy.com ends up in my Gmail inbox. You’d be appalled: at this moment, my inbox contains 44 unread emails and a whopping 10,982 messages that I’ve already read, dating back to November 2016.

Google tells me I’m using 15.45 gigabytes (81%) of the 19 gigabytes that they give me for free. That 19 GB is spread over Docs, Photos and Drive, but the bulk of it is old email. Google will happily sell me 100 GB of storage for $16.99 annually (or a buck ninety-nine paid monthly), but I’m a cheap bastard. And I know there’s a lot of crap sitting around, taking up space. So, my plan is to save it all into a real database.

First, a little background on email. The format of email messages is governed by a handful of Request for Comments documents, RFCs for short. These docs are managed by the Internet Engineering Task Force (IETF), which periodically revises them. The most important are RFC 5322 (the basic Internet Message Format) and RFCs 2045 through 2047, which cover the Multipurpose Internet Mail Extensions, MIME for short. You can Google them up if you like reading lots of “must,” “shall” and “may.”

RFC 5322 defines the syntax for text messages sent via email on the Internet. It specifies the structure of email messages, including headers (such as From, To, Subject and Date) and the message body, using a standardized format based on 7-bit, 128-character ASCII text.

The message body is where things get complicated. Originally just ASCII text, the message body now commonly contains a MIME document. MIME extends the message body to support multimedia content and internationalization. It allows for different media types (like images, audio or applications) and for encodings that represent non-ASCII data. It also enables emails to contain multiple attachments or embedded content. Finally, it describes the encoding of non-ASCII characters in email header fields, allowing international character sets to be used safely when sending email.

Gmail, while great at searching for stuff in my massive inbox, doesn’t really give me the flexibility I want to sort through my email. My plan is to download all my email (Google provides an easy facility called Google Takeout to do just that, at takeout.google.com) and transfer it to a real database, where I can search and classify it to my heart’s content (or just ignore it, as I presently do). It’s a simple program to write, so I thought I’d see what ChatGPT could do for me.

While the free version of ChatGPT (available at ChatGPT.com) can produce code, I wanted to see the best it can do, which requires coughing up $20 a month for ChatGPT Plus, with access to models that are specifically trained for code generation. Here’s what I initially told it: “Write me a Python program which takes the name of a file in MBOX format and turns that data into a SQLite database. Before you generate any code, we should discuss how to handle potentially large attachments.”

It responded with choices for storing attachments: inside the database, on disk, or a hybrid approach, with small attachments in the database and large attachments on disk. I chose to store each attachment as a file on disk. The database just holds the filename.
So I chose that approach and added some instructions. “Approach number 2 seems best. It’s consistent across file size. There should be an ‘attachments’ directory, and within it, subfolders for the attachments to each message, named with the message ID. Inside that folder are the attachments for that specific message. What do you think?”

It came back with some additional considerations: handling errors and dealing with file names (collisions, illegal characters). I made some minor adjustments, and turned it loose. I probably spent 10 minutes working through the above with my junior programming assistant. Sixteen seconds later, I had a fully working program. Amazing, right?

Of course, it had a fairly serious problem: it saved the attachments, but not the body of the message they came with. I should have stated in my requirements that I wanted all the information from the original message to be saved, and that the program should verify that it can reconstruct the original message in my mailbox data from the generated SQL data. I also want to have it generate the program in TypeScript (currently my go-to language), since my Python is a bit rusty (and Python isn’t statically typed, which makes writing more error prone). So, that’s what I did next.

Come back next month for the final installment of my conversion to the Dark Side, and some talk about “vibe” coding for non-programmers.
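For the curious, here is a minimal Python sketch of the on-disk design described in this column: one row per message in SQLite, with each attachment written to a subfolder of an ‘attachments’ directory named after the message ID. It is not the program ChatGPT generated. The table layout and the helper names (`safe_name`, `header`, `unique_path`, `mbox_to_sqlite`) are assumptions made for illustration. It uses only Python’s standard `mailbox`, `email` and `sqlite3` modules, and it keeps the raw message bytes alongside the parsed fields so that nothing from the original message is lost.

```python
import email
import email.policy
import mailbox
import re
import sqlite3
from pathlib import Path

# Hypothetical schema: one row per message, one row per attachment file.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    message_id TEXT,
    sender TEXT,
    subject TEXT,
    date TEXT,
    body TEXT,
    raw BLOB
);
CREATE TABLE IF NOT EXISTS attachments (
    message_rowid INTEGER REFERENCES messages(id),
    filename TEXT,
    path TEXT
);
"""


def safe_name(name):
    """Replace characters that are risky in file names; never return an empty name."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    return cleaned or "unnamed"


def header(msg, name):
    """Return a header as a plain string, or None if the header is missing."""
    value = msg[name]
    return None if value is None else str(value)


def unique_path(folder, filename):
    """Avoid collisions by prefixing a counter when the file name is already taken."""
    candidate = folder / filename
    counter = 1
    while candidate.exists():
        candidate = folder / f"{counter}_{filename}"
        counter += 1
    return candidate


def mbox_to_sqlite(mbox_path, db_path, attachments_dir="attachments"):
    """Copy every message in an MBOX file into SQLite; return the message count."""
    box = mailbox.mbox(mbox_path, create=False)
    db = sqlite3.connect(db_path)
    db.executescript(SCHEMA)
    count = 0
    try:
        for key in box.keys():
            raw = box.get_bytes(key)  # original message bytes, without the "From " line
            msg = email.message_from_bytes(raw, policy=email.policy.default)
            body_part = msg.get_body(preferencelist=("plain", "html"))
            body = body_part.get_content() if body_part is not None else None
            cur = db.execute(
                "INSERT INTO messages (message_id, sender, subject, date, body, raw) "
                "VALUES (?, ?, ?, ?, ?, ?)",
                (header(msg, "Message-ID"), header(msg, "From"),
                 header(msg, "Subject"), header(msg, "Date"), body, raw),
            )
            rowid = cur.lastrowid
            msg_id = (header(msg, "Message-ID") or f"no-id-{rowid}").strip("<>")
            folder = Path(attachments_dir) / safe_name(msg_id)
            for part in msg.walk():
                if part.get_content_disposition() != "attachment":
                    continue
                data = part.get_payload(decode=True)
                if data is None:  # e.g. an attached multipart message; skipped in this sketch
                    continue
                folder.mkdir(parents=True, exist_ok=True)
                original = part.get_filename() or "attachment.bin"
                target = unique_path(folder, safe_name(original))
                target.write_bytes(data)
                db.execute("INSERT INTO attachments VALUES (?, ?, ?)",
                           (rowid, original, str(target)))
            count += 1
        db.commit()
    finally:
        db.close()
        box.close()
    return count
```

Calling `mbox_to_sqlite("takeout.mbox", "mail.db")` (both file names are hypothetical) returns the number of messages stored. A real mailbox would also need handling for messages whose bodies cannot be decoded, which this sketch leaves out.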
Michael E. Duffy is a 70-year-old senior software engineer for Electronic Arts. He lives in Sonoma County and has been writing about technology and business for NorthBay biz since 2001.
April 2025
NorthBay biz