GC is great from the point of view, that a programmer doesn't have to think about freeing variables - they do it by themselves. However, it brings performance penalties (and so-called "spikes" in terms of latency for web-applications). There are already Rust with its lifetimes, C++ with RAII and smart pointers, etc. However, they require programmer to process them manually. What about mixing both ways: compiler derives lifetimes automatically? Then it feels like a GC, but no actual GC is used.
Let's introduce the main point of the mechanism: hierarchy. It represents a structure of a type. Hierarchy consists of layers, where each layer is a container. For example: - Fundamental types are containers for primitives, that doesn't contain any other layesr. For example: int, float, bool, string, etc. - Complex types are containers, that may contain other types internally. For example: arrays, hashmaps, classes, etc. Each hierarchy may consist of at least 1 container (for primitive types. Upper-bound isn't defined. Hierarchy's first layer is always a top-level container, that contains all other nested types. Wrapping some hierarchy into array, map, etc. results in a new hierarchy with wrapper at the top. Each layer of a hierarchy responds for the whole hierarchy. This means, that if any of levels is still in use, the whole hierarchy will exist.
If value is a pretender for allocation, first let's build a graph of adventure of the value. Detect death's scope - a scope where the value is used for the last time. In specific case, when origin scope is also a death's scope - avoid allocating on heap, pass pointer onto a stack instead.
-
Returning random string from the array:
fn greet() str { greets := ["Hello", "Hey", "Howdy"] return random.choice(greets) } fn main() { print(greet()) }It's interesting, as we've got the array, that dies just after the greet() exits. The problem is, that it contains strings, that also needs to be freed, however they are being returned.
At the point of defining the array, we've got a hierarchy consisting of 2 layers: array -> string. We detect, that belonging to the array string is returned, so by that we are keeping the whole hierarchy. Only on exit from main (which is a death's scope) it'll be freed
-
Producing multiple allocated pretenders:
fn greet() str { positive := ["Hello", "Hey", "Howdy"] negative := ["Who asked", "Average L", "Don't talk to me"] if random.random() > 0.5 { return random.choice(positive) } return random.choice(negative) }Here we've got two arrays, and undetermined state, where we can't definetely know, which hierarchy will be returned. This is called "multiple escaping pretenders", and we really can't handle it on compile-time. So the solution is, just to collect all the pretenders available, and on each return free them all, except the returned one.
Receiver-scope won't mind about which hierarchy was actually returned, as its type is similar - origin of all the layers doesn't play any role. Analysis rule works for the receved hierarchy the same as it was born in receiver scope
-
Returning CT-nondetermined value
fn nums() [int] { nums := [1, 2, 3] if random.random() > 0.5 { nums = [4, 5, 6] } return nums }We don't actually know, which concrete value are we gonna return. The problem is, that both are independent type hierarchies, and both may be returned via a single return statement. For this case we can mark all the return candidates, and bind their type hierarchies (make them dependant). So by that, no matter which candidate will actually be returned, but they all will be freed simultaneously.
-
Let's see a case, where we return a tuple:
fn pairOfGreets() (str, str) { positive := ["Hello", "Hey", "Howdy"] negative := ["Who asked", "Average L", "Don't talk to me"] fst := random.choice(positive) snd := random.choice(negative) if random.random() > 0.5 { snd = random.choice(positive) } return (fst, snd) } fn main() { fst, snd := pairOfGreets() print(fst) print(snd) }Here we've got an undetermined at compile-time state, in which we cannot definetely know whether will hierarchy of negative leak or not.
The algorithm has problems related to space leaks. Explicit example of such is a case, when we allocate an array of million strings, and then return some string from it. Theoretically, we don't use the array anymore, so we can free it just after we exit from a function. However, it can't as the string is still attached to it. So by that, lifetime of the array of million objects is equal to a lifetime of a single string. This may cause several problems, if function never returns and the array consumes significant amount of memory.
There are multiple propositions to extend and/or modify the algorithm:
Instead of freeing a value on exit from a death's scope, we can do it just after the last usage of the value. This makes the process of giving unused heap-allocated back a bit faster, and may be handy in cases where between last usage of the value and exit from a scope there's some code (for example, IO), execution of which takes a long time.
This fixes space leaks, related to arrays (as described in Perculiarities). The point is to detach some type from its type hierarchy, making it independent. So by that, the array can be freed on exit from a function, including its content. However, in order to avoid freeing the one returned string, we must check each element, and this has some costs, sometimes even significant (for example, non-inlined overloading comparator).