
Updates

Commit 0a60ddec02 by Julio Biason, on branch master

content/code/overthinking-rust-iterators/index.md
+++
title = "Overthinking Rust Iterators"
date = 2023-07-06
[taxonomies]
tags = ["rust", "iterators", "request", "stream"]
+++
I had an issue recently with Rust iterators, and that led me to think *a lot*
about iterators in Rust.

<!-- more -->

What I wanted to do was something not exactly straightforward in Rust:

- The source was an external REST API;
- The API returns the data in chunks, providing a paging mechanism;
- The API indicates that there is more data with a `next` field, which either
  has the URL for the next page or an empty string if you're on the last page
  and there is no more data;
- On my side, I wanted something akin to this (which is basically an iterator,
  anyway):
```rust
let service = Service(connection_information);
let mut data = service.data(); // This provides the iterator
while let Some(record) = data.next() {
    do_something(&record);
}
```
- The `.data()` iterator would get the first page and start iterating over
  those results;
- Once the results were all consumed, if the API informed that there is more
  data, the iterator (or *something*) would request more information, adjust
  itself for the new data and just keep chugging till all the data was
  produced.

Notice that the iterator I want has two sides: one spews records from the
previous request out of memory/cache; the other requests more data (or
triggers the request somewhere).
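Those two sides can be sketched in one struct: an owning buffer for the cached side and a `next_url` for the requesting side. Everything here (`Page`, `fetch_page`, the URLs) is made up for illustration; the real thing would be an HTTP request with error handling:

```rust
// Hypothetical page shape, standing in for the API's response: a chunk
// of records plus the URL of the next page (empty string meaning
// "no more data", as the API does).
struct Page {
    records: Vec<String>,
    next: String,
}

// Stand-in for the real REST call; here it just serves two fixed pages.
fn fetch_page(url: &str) -> Page {
    match url {
        "/data" => Page {
            records: vec!["a".into(), "b".into()],
            next: "/data?page=2".into(),
        },
        _ => Page {
            records: vec!["c".into()],
            next: String::new(),
        },
    }
}

// The iterator I want: spew buffered records and, when the buffer runs
// dry, request the next page (if the API said there is one).
struct Records {
    buffer: std::vec::IntoIter<String>,
    next_url: String,
}

impl Records {
    fn new(first_url: &str) -> Self {
        let page = fetch_page(first_url);
        Records {
            buffer: page.records.into_iter(),
            next_url: page.next,
        }
    }
}

impl Iterator for Records {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        loop {
            if let Some(record) = self.buffer.next() {
                return Some(record);
            }
            if self.next_url.is_empty() {
                return None; // last page fully consumed
            }
            let page = fetch_page(&self.next_url);
            self.buffer = page.records.into_iter();
            self.next_url = page.next;
        }
    }
}
```

Making the buffer an owning `std::vec::IntoIter` (instead of an iterator borrowing from a `Vec` stored next to it) is what keeps the borrow checker quiet in this toy version.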
# Back to Iterators
Basic iterators work like this:

![](normal-iterator.png "A basic view of an iterator")

... in which you have a dataset and create an iterator over it; each call of
`.next()` advances the iterator to the next element of the data and returns a
reference to that element; once it reaches the end of the data, it returns
`None`, indicating that there is no more data.

The fun thing about iterators is that they need to hold their own state: which
element am I currently pointing to? That's exactly why `.next()` receives a
mutable reference to self: it changes its state on each call.
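As a tiny illustration of that state-holding, a hand-rolled counter (made up for this post) whose entire state is the current position, mutated on every call:

```rust
// A hand-rolled iterator: its whole state is `current`, and every call
// to `next` mutates it -- hence the `&mut self`.
struct Counter {
    current: u32,
    max: u32,
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.current < self.max {
            self.current += 1;
            Some(self.current)
        } else {
            None // past the end: no more data
        }
    }
}
```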
What I need is, basically, an iterator that does that **and**, once it sees
`None`, retrieves more data and starts over. This raises the question: how
does the iterator get more data?
# The Fat Iterator Approach
The idea I had was to create a fat iterator that would "hold" its own data and
iterate over it.

![](fat-iterator.png "A fat iterator which has its own data")

Because the data is simply a `Vec<>`, I could do something like:

1. Pull data from the service;
2. Update the `data` inside the iterator;
3. Create a new iterator over said `data`;
4. Call `.next()` on the iterator till it returns `None`;
5. If there is more data, do the request and jump to 2.

If we jump back to the fact that `.next()` updates the iterator's internal
state, this means I'd need to keep the data **and** its iterator in the same
structure. And that causes issues with the borrow checker, 'cause I can't
borrow part of the data while the same structure owns the whole data (a
self-referential struct; yes, it feels like a problem with the borrow checker,
but still).

The idea seems solid, except I'd be fighting the borrow checker to a point I'm
not capable of yet.
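One way to sidestep that fight, sketched here with made-up types rather than the actual service: store a plain index instead of an iterator, so the struct owns the `Vec` and a cursor and never borrows from itself.

```rust
// Fat "iterator" that owns its data plus a cursor index, instead of
// owning the data *and* a second iterator borrowing from that data.
struct FatIterator {
    data: Vec<i32>,
    cursor: usize,
}

impl Iterator for FatIterator {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        let item = self.data.get(self.cursor).copied();
        self.cursor += 1;
        item
    }
}

impl FatIterator {
    // Steps 2 and 3 of the list above: swap in a fresh page and rewind.
    fn refill(&mut self, new_data: Vec<i32>) {
        self.data = new_data;
        self.cursor = 0;
    }
}
```

Nothing in `FatIterator` holds a reference into `data`, so the borrow checker has nothing to complain about; the cost is re-doing the bounds check that a real slice iterator would optimize away.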
# The "Request Someone Else" Iterator
The other idea I had (but couldn't figure out how to make work) was that,
instead of `service.data()` returning an iterator, it would return the data
holder, and *that* could create an iterator over itself. The weird thing about
this is that the iterator would have to have a mutable reference to the source
data, so it could call the parent when it reached the end of the data; the
parent would get a new data source and the iterator would "reset itself" after
calling it -- which sounds more complex than it should be.

(I could also make the parent holder keep the data in a `Cell<>` to get
interior mutability over it but, again, that sounds more complex than it
should be.)
# The Solution
Sorry, no solution (yet). I'm still tinkering with it, and I'll update this
once I find something that works and doesn't require two (or more) things
(mutably) interacting with each other.
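That said, one pattern worth noting while tinkering (a sketch with toy types, not a claim that this is the answer): `std::iter::from_fn`, where a single closure owns all the state, buffer and next-URL alike, so there's only one thing mutating instead of two things borrowing each other.

```rust
// Toy stand-in for the REST call: returns (records, next_url).
fn fetch_page(url: u32) -> (Vec<u32>, Option<u32>) {
    match url {
        0 => (vec![1, 2], Some(1)),
        _ => (vec![3], None),
    }
}

fn records() -> impl Iterator<Item = u32> {
    let (first, next) = fetch_page(0);
    let mut buffer = first.into_iter();
    let mut next_url = next;
    // The closure owns buffer + next_url; from_fn calls it per element.
    std::iter::from_fn(move || loop {
        if let Some(record) = buffer.next() {
            return Some(record);
        }
        let url = next_url.take()?; // no next page: iterator ends
        let (page, next) = fetch_page(url);
        buffer = page.into_iter();
        next_url = next;
    })
}
```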

content/code/the-problem-with-progressive-typing-examples.md
+++
title = "The Problem With Gradual Typing Examples"
date = 2023-07-12
draft = true
[taxonomies]
tags = ["programming languages", "typing", "static typing", "progressive typing"]
+++
[Gradual Typing](https://en.wikipedia.org/wiki/Gradual_typing) offers a mix of
static typing and dynamic typing: you can let the compiler/interpreter figure
out the types at runtime when you don't specify them, but you can also declare
the type of a variable and the compiler/interpreter will catch operations on
incompatible types before the code runs.

But the examples I've seen always pull things, in my opinion, in the wrong
direction.

<!-- more -->

The usual example of something bad for gradual typing is this kind of code:
```python
def add(a, b):
    return a + b
```
The issue is that there are multiple types whose values can be added. If we
consider just the primitive types of Python, we would get something like:
```python
from typing import Union

CanAdd = Union[str, float, int]

def add(a: CanAdd, b: CanAdd) -> CanAdd:
    return a + b
```
This code obviously breaks when other things can be added. For example, you
can create a class and override its `__add__` operator (the `+`) and, thus,
objects of said class can be added. But because our primitives-only union
doesn't list our new class, the compiler/interpreter would never accept a call
of `add()` with those objects.

(There is another issue, sometimes cited, sometimes not: nothing says that the
type of `b` must be the same as the type of `a`, and thus one could add an
integer to a string, which is... wrong.)
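For the record, that second issue is fixable with a constrained type variable; a checker like mypy would then flag `add(1, "x")`, although nothing is enforced at runtime:

```python
from typing import TypeVar

# Constrained TypeVar: both arguments must resolve to the *same* type,
# and that type must be one of str, float, or int.
T = TypeVar("T", str, float, int)

def add(a: T, b: T) -> T:
    return a + b
```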
But I have a question:
**Is that a real world kind of function?**
Oh, don't get me wrong: simple functions with just a few lines are quite
normal. But something that would be *real*[^1] looks more like:
```python
def item_total(qty: int, price: float) -> float:
    return qty * price
```
I guess this is more common than `add()`, because it's the kind of operation
that happens all the time. And it would catch cases like:
```python
def receive_api_request(json_data):
    ...
    for item in json_data['items']:
        total += item_total(item['price'], item['qty'])  # arguments swapped!
```
If you don't type check, this produces exactly the same results (multiplication
commutes, after all). But the code is wrong, and if someone makes a small
change in `item_total()`, you'd end up with the strange situation in which only
this call produces wrong results, while all the other call sites would
(probably) still produce the correct ones.
{% note() %}
Surely there are issues, nonetheless. Python, which I used in this example,
would still not accept anything that could be automatically coerced to the
expected type if the matching interface doesn't exist, although there is work
on supporting such types with "protocols".
{% end %}
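As a sketch of what those protocols look like (structural, so any class with `__add__` qualifies and there is no union to maintain; the `Addable` name is mine):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Addable(Protocol):
    """Anything that implements __add__: ints, strs, custom classes."""
    def __add__(self, other): ...

def add(a: Addable, b: Addable) -> Addable:
    return a + b
```

`runtime_checkable` even lets `isinstance()` do a structural check at runtime, matching on the presence of `__add__` rather than on a class hierarchy.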
---
[^1]: For different values of "real".