Raspberry_Pi_Education_Manual

Notes:

The permanent storage on a computer (where all the data is kept when the machine is turned off) will usually be larger than the temporary storage (internal computer memory, which is lost when power goes off). On the Raspberry PI, your SD card (permanent storage) is likely to be much larger than the 256Mb of RAM (temporary storage). Look back at the previous experiment that loaded a text file, sorted the entries and then displayed the sorted list. If the file was larger than the fraction of 256Mb that was allotted to Python, how could we sort that file? The Raspberry Pi’s Linux operating system has a virtual memory solution but what if you wanted to keep down the Python memory being used? The answer is to sort part of the list, save that partially sorted list to the permanent storage, and repeat. Once the entire large file has been partially sorted to smaller files, merge all the entries in those files together one by one to make the final file. Very little is kept in memory. Understanding a “generator” is quite difficult, but imagine that it is like a function that generates the elements of a sequence. After it “yields” an element of the sequence, it does nothing until it is called again. So the generator here takes a file, reads a line from it and yields that line to whatever called it. Once it cannot find any more lines, it just stops returning items, which the caller will notice and do something else. You’ve seen a generator already. It was called “range”. The file that is passed to this generator is a subset of the larger file that we want to sort. The first loop creates a lot of temporary files on your permanent storage, each of which contains 10 lines that have been sorted. The first loop doesn’t make use of the sorted partial files immediately. It appends the generator to a list. It is making a list of the generator functions (linesFromFile) that will read lines one at a time from the file. The “heapq” merge function loops through the values given by the generators and merges them in order. It assumes that the elements from the generator will come in a sorted order. So, imagine it takes the initial entries from each partial file until it has worked out which of these is the first. It gives that back, then it continues doing this one by one, only holding a small amount in memory. All that’s left is to do something with each value as it is given. In this case, print it on the display. The clever part here is to use a Python generator and heap queue .

Experiments in Python

107

Made with FlippingBook flipbook maker