Adventures in Ollama (Part 1)
So I was playing around with Ollama, and realizing its potential, I decided to test it out for code generation. I downloaded a bunch of models and gave them all the same prompt: a somewhat detailed description of a PHP text-file processing script, which would do a certain task for me and then check its work. I passed the prompt to various models and got some very interesting results. Here are my observations.
llama3.3:70b
I ran the prompt several times on this model. It would take 10 to 20 seconds to produce the first token, then finish a minute and change later. The first run produced 95 lines; the second, 108. The smallest was just 78 lines and the longest was a whopping 114.
All of the Llama 3.3 70b runs produced human-readable JSON, and all of them correctly reproduced the file except run 4. The fourth time I used this model on the same prompt, it forgot to handle the filename/extension properly and did this:
if (isset($inputs['out'])) {
    $outputFilename = $inputs['out'] . '.json';
    $outputTextFilename = $inputs['out'] . '-2.txt';
} else { ... }
As you can see, if you give it out=filename.json, it will save to filename.json.json, which is a mistake. I told the model, and it responded with this:
// Check if the 'out' command is provided, otherwise use the filename with .json extension
if (isset($inputs['out'])) {
    $outputBaseName = pathinfo($inputs['out'], PATHINFO_FILENAME);
    $outputFilename = $outputBaseName . '.json';
    $outputTextFilename = $outputBaseName . '-2.txt';
} else { ... }
...which is also a mistake: pathinfo() with PATHINFO_FILENAME throws away both any directory component and the extension the user actually asked for. It should be more like this:
if (isset($inputs['out'])) {
    $outputFilename = $inputs['out'];
    $outputBaseName = pathinfo($outputFilename, PATHINFO_FILENAME);
    $outputTextFilename = $outputBaseName . '-2.txt';
} else { ... }
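To see the difference concretely, here's a quick sketch with a made-up out value (story.txt is hypothetical; $inputs comes from wherever the script parses its arguments):

$inputs = ['out' => 'story.txt'];

// The model's second attempt: pathinfo() discards the user's extension
echo pathinfo($inputs['out'], PATHINFO_FILENAME) . '.json';   // story.json -- not what was asked for

// Keeping the name as given, deriving only the companion file from the base name
$outputFilename = $inputs['out'];                             // story.txt
$outputTextFilename = pathinfo($outputFilename, PATHINFO_FILENAME) . '-2.txt';   // story-2.txt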
The second problem with run #4 was that it wasn't reproducing the text file properly, so I took a look at what it was doing. First, here's the working code from an earlier run (run #3):
// Function to output tokens to a file
function output_tokens($objects, $filename) {
    $text = '';
    foreach ($objects as $obj) {
        $text .= $obj->t;
    }
    file_put_contents($filename, $text);
}
And now here’s the code that doesn’t work (from model run #4):
// Function to output the tokens to a file
function outputTokens($tokens, $filename) {
    // Open the file in write mode
    $file = fopen($filename, 'w');

    // Write each token to the file
    foreach ($tokens as $token) {
        fwrite($file, $token->t . "\n");
    }

    // Close the file
    fclose($file);
}
It's very clear what's wrong: the second version appends a newline after every token. This is an easy enough fix, but I don't know why llama3.3:70b did this.
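For what it's worth, the one-line fix is just to drop the appended "\n" (or to build up a string and write it once with file_put_contents(), as run #3 did); since run #3 reconstructed the file correctly from the tokens' ->t values alone, nothing extra should be written between them:

foreach ($tokens as $token) {
    fwrite($file, $token->t);  // no "\n" -- the token text already accounts for spacing
}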
As I was talking to the model to fix these problems, the 20- to 30-second waits until the first token started to bother me. But I still count this as a win, since Open-WebUI was providing the context from my previous prompts, so "Great! Please give me the complete corrected program" worked as expected. Then I looked at the code and found it was six lines longer. What? That isn't right. I looked at what the model had done: it had added four lines to validate the filename (a good idea), and the other two lines were the whitespace around them. So, no biggie. Even though it took me 5 or 6 minutes to look over the code, and the processing could be considered "slow", I think llama3.3:70b did an excellent job for this use case.
Interestingly, after complaining about how it saved files, I realized run #4 was the only one that echoed anything from the a1 script; it said “Processing complete. Output saved to d.json and d-2.txt.” Probably as a result of me complaining about the filenames.
Llama 3.3 70b in Retrospect
I was surprised to see so many slight differences each time I ran it. The variable names, the order in which it did things, and decisions such as how much to refactor seemed almost random. For example, in one run it made a function like this:
function cleanWord($word) {
    // Remove non-essential characters from the word
    $cleanedWord = preg_replace('/[^\w\-\'\s]/', '', $word);
    return $cleanedWord;
}
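That regex, for reference, keeps word characters, hyphens, apostrophes, and whitespace, and strips everything else. A couple of hypothetical inputs to illustrate:

echo cleanWord("don't!");      // prints: don't
echo cleanWord('"hello,"');    // prints: hello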
But in another run, it just called preg_replace() inline, inside a loop. In one run it used fopen() and fclose(); in another, file_put_contents(). One run looked like this:
// Calculate analytics
$numWords = count(array_filter($objects, function ($obj) {
    return !isset($obj->whitespace);
}));
$processingTime = microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'];
while another looked like this:
// Calculate analytics
$analytics = new stdClass();
$analytics->num_words = count(array_filter($objects, function($obj) {
    return !$obj->whitespace;
}));
$start_time = microtime(true);
$processing_time = microtime(true) - $start_time;
(Notice, incidentally, that the second version's timing is broken: $start_time is set one line before $processing_time is computed, so it will always come out as roughly zero, whereas the first version measures from $_SERVER['REQUEST_TIME_FLOAT'], the actual start of the script.) In runs 1 and 4, it included the whitespace (as $obj->w) even when it was a zero-length string; in runs 2 and 3 it omitted it when it was zero-length. And in three of the runs it always included the whitespace boolean, whether true or false, but in run 2 alone it only included it when it was true.
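To make that concrete, here's roughly what the two shapes look like for a word token with no trailing whitespace (my reconstruction of the pattern, not verbatim output from the runs):

// Runs 1 and 4: empty 'w' kept, boolean always present
echo json_encode(['t' => 'hello', 'w' => '', 'whitespace' => false]);
// {"t":"hello","w":"","whitespace":false}

// Runs 2 and 3: empty 'w' omitted (and run 2 also dropped the boolean when false)
echo json_encode(['t' => 'hello', 'whitespace' => false]);
// {"t":"hello","whitespace":false}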
Overall, my impression of llama3.3:70b for coding is that it's good, but the waters are a bit choppy: while it will save you an hour writing a script like this, you still need to put in 5 or 10 minutes of grunt work on the result. But from where I stand, never having used a tool like this before, it is obviously worth an incredible amount to me as a programmer. I can't see not using it in the future and being successful. I'm not sure if that scares me.
Let’s try llama3.3:70b-instruct-q8_0!
Well, maybe a different version of llama3.3 is better? Again, I ran the prompt four times. It took only slightly longer: 30 seconds to a minute until the first token at the slowest, and 1 to 2 minutes to finish. The programs were all around the same length: 95 lines here, 107 there, and so on.
Runs 1 and 4 produced human-readable JSON like before, but runs 2 and 3 omitted the newlines and the files were a giant jumble, very hard to read.
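I didn't diff the two variants line by line, but in PHP this usually comes down to a single flag: whether json_encode() was called with JSON_PRETTY_PRINT. A minimal illustration with made-up data:

$data = [['t' => 'Once', 'w' => ' '], ['t' => 'upon', 'w' => ' ']];
file_put_contents('pretty.json', json_encode($data, JSON_PRETTY_PRINT));  // readable, one field per line
file_put_contents('flat.json', json_encode($data));                      // one giant line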
However, surprisingly, none of the runs managed to name the reconstituted story file properly! At first I couldn't tell which runs had reproduced the file correctly, because they were all clobbering story-2.txt (except for one, which mysteriously saved to st-2.txt). Looking at the code, the problem appeared to be that the model just didn't understand how to name the files. Once I made some slight changes to the naming and pathing for the four runs, it turned out the content was in fact being processed correctly. The -instruct model just didn't understand how PHP handles paths, and didn't realize it needed to process the filenames the way the llama3.3:70b model did.
Out of the two, I would choose llama3.3:70b.
Filed under: AI, LLM, Ollama, PHP, Programming @ October 28, 2025 1:38 am