
converted the remaining posts

commit aaa19047ea on master, by Julio Biason, 6 years ago
  1. content/agile-vs-culture.md (+59)
  2. content/agile.jpg (BIN)
  3. content/couchbase-example-and-rest.md (+238)
  4. content/dead-github-maintainers.md (+53)
  5. content/juliobiason.net-3.0.md (+39)
  6. content/mocking-a-mock.md (+114)
  7. content/pre-order-the-case-of-no-mans-sky.md (+60)
  8. content/python-2-3-six.md (+312)
  9. content/the-day-i-found-my-old-code.md (+102)
  10. content/the-sad-life-of-walter-mitty.md (+43)
  11. content/when-i-used-pep8-to-fuck-up-code.md (+66)

content/agile-vs-culture.md

@@ -0,0 +1,59 @@
+++
title = "Agile vs Culture: The Story of Outliers"
date = 2015-12-18
category = "thoughts"
[taxonomies]
tags = ["agile", "book", "empowerment", "disenfranchise", "en-au"]
+++
When the culture goes against agile.
<!-- more -->
![The Agile cycle](/agile.jpg)
At some agile conferences I went to this year, I kept recalling and
telling one story from
[Outliers](https://www.goodreads.com/book/show/3228917-outliers)
(which I wrongly assumed was part of
[Freakonomics](https://www.goodreads.com/book/show/1202.Freakonomics))
about the number of accidents in Asian and South American
airlines. The book points out that there is a cultural difference between those
two and American people, in which the former see a larger distance between
themselves and their superiors than the latter.
Why do I keep recalling this? Because in agile teams there is no hierarchy: the
PO is as important as the junior developer; the tester's input is worth as much
as the senior developer's. This means the team doesn't need to wait
for someone higher up the chain to make a decision: the team is free to make
its own decisions on how to best deliver the value requested by the PO.
At every event I attended, there was the constant question of "how do I make my
team see the value in Agile" and "why doesn't Agile work". Again, it seems that
Agile goes straight against the cultural background of South Americans -- in this
case, me and my colleagues -- because we are culturally trained to look to the
person higher up the chain and, thus, to depend on them for the
important questions (for whatever value of "important" I believe a solution
has).
In the end, it's not so much about changing a company's development model and
explaining to managers and directors how the software -- and its value
-- will be delivered, but about fighting the cultural norm of having
someone in a very high place who makes the decisions while people think they
are too low in the chain to make one. Not counting the constant fear
of being wrong (which is actually a good thing in agile).
The problem revolves not only around this point, but also around positions
assumed based on role names. Someone will assume that because their position is
"developer", they are below -- and take orders from --
the PO; someone will assume that because someone else's role is tester and
theirs is developer, they are higher in the hierarchy and, thus, can order
the tester to do whatever they think must be done.
Here we have a second problem: we need to detect and empower those who think
they are lower in the chain and "disenfranchise" those who think they are
above everyone else due to their role name.
My plan for 2016 is to read some books on those topics and bring this
discussion to future events. Wish me luck. ;)

content/agile.jpg

Binary file not shown (33 KiB).

content/couchbase-example-and-rest.md

@@ -0,0 +1,238 @@
+++
title = "Couchbase Example and REST"
date = 2016-01-12
category = "code"
[taxonomies]
tags = ["rest", "couchbase", "example", "restful", "en-au"]
+++
Using the Couchbase example to show how REST works.
<!-- more -->
Let me start this by pointing out that I'm a RESTnazi: I'm the kind of guy
who will get into a fight with anyone who says things like "Ok, that's because
this is just REST, not RESTful" because... well, because there is no
difference between REST and RESTful.
And today I found something weird while reading
[the Couchbase documentation](http://developer.couchbase.com/documentation/server/4.1/travel-app/travel-app-walkthough.html),
where they claim that their example is REST while... well, it isn't.
But hey, that's a good opportunity to explain a bit of what REST is (and what
it is not).
## What is REST?
REST is an architecture/design pattern/pick-your-buzzword built on top of HTTP
to provide information. It has two components:
* **Resources**: these are the elements in your system: your users, your books,
your airports, your flights and such.
* **Verbs**: these are the things you do with your resources: you GET them,
you UPDATE them, and so on.
There is no true "guideline" on how to name resources. It's usually done with
nouns in their plural form (or, at least, that's what [Apigee](http://apigee.com/about/)
concluded after checking a bunch of APIs around). Those resources are mapped
to URLs under some base.
Let's pick the example from Couchbase: it's a travel app, with airports,
flights and flight paths. We could use a base URI scheme of
`/travel/api/v1.0/` because:
1. The travel app could also provide a user interface through `/travel/`, so
we keep the API endpoints under `api` to not mix things.
2. We are versioning the API (here, v1.0). This is a recommendation from
Apigee and, again, not part of the architecture/design pattern/buzzword.
On top of this base URI, we'll build our resource URLs:
* `/travel/api/v1.0/airports/` and
* `/travel/api/v1.0/flights/`
"Where is the flight path endpoint?", you must be asking. Well, I'll tell you
about it later, hold on a second; we'll use those two to explain the very
basics of REST first, ok?
Besides those two URIs, we need two more: one for each resource, to access
individual elements. So now we have:
* `/travel/api/v1.0/airports/`;
* `/travel/api/v1.0/airports/{airport_id}`;
* `/travel/api/v1.0/flights/` and finally
* `/travel/api/v1.0/flights/{flight_id}`.
So, now that we have our resources, we need ways to manage their contents. For
this, we use the "verbs" I mentioned before. The thing about REST is that
those actions are directly tied to the HTTP verbs:
* **GET** will retrieve elements in the resource;
* **POST** will insert a new element in the resource;
* **PUT** is used to update the information of an element;
* **DELETE** is used to remove an element from the resource.
{% note() %}
If you want an easy mnemonic, "PUT" has a "U", for "update". Yes, it's silly,
but it works (at least, for me). Also, a "PUT" directly on a resource means
"replace the whole database with this information" and is, thus, not really
widespread.
{% end %}
{% note() %}
You can add a DELETE for your whole resource, if you're crazy and bold enough.
{% end %}
And putting resources and verbs together we have:
* Get a list of all airports: `GET /travel/api/v1.0/airports/`
* Add a new airport: `POST /travel/api/v1.0/airports/`
* Get information about a single airport: `GET /travel/api/v1.0/airports/airport_3577`
* Update the information of an airport: `PUT /travel/api/v1.0/airports/airport_3577`
... and so on.
Easy as pie, right?
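As a rough sketch of how that verb + resource mapping might look in code, here is a hypothetical, framework-free dispatch table (the action names and the helper are made up for illustration; a real service would use a web framework's router):

```python
import re

# Hypothetical dispatch table: (HTTP verb, URI pattern) -> action name.
ROUTES = [
    ('GET',  r'^/travel/api/v1\.0/airports/$',            'list_airports'),
    ('POST', r'^/travel/api/v1\.0/airports/$',            'create_airport'),
    ('GET',  r'^/travel/api/v1\.0/airports/(?P<id>\w+)$', 'get_airport'),
    ('PUT',  r'^/travel/api/v1\.0/airports/(?P<id>\w+)$', 'update_airport'),
]

def dispatch(verb, uri):
    """Return (action, captured ids) for a request, or None if nothing matches."""
    for method, pattern, action in ROUTES:
        match = re.match(pattern, uri)
        if method == verb and match:
            return action, match.groupdict()
    return None

# e.g. dispatch('GET', '/travel/api/v1.0/airports/airport_3577')
# -> ('get_airport', {'id': 'airport_3577'})
```

Note how the same URI maps to different actions depending only on the verb; that is the whole trick.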
## The "Flight Path" resource
Now let's go back to the "flight path" resource, which I left behind. Thing
is, a flight path does not exist on its own: if a flight doesn't exist, the
flight path doesn't exist either, right? And if a flight exists, it should
have a path, right?
So a flight path is a resource linked directly to our flights resource. For
this, REST allows resource chaining by just adding another layer on top of
existing URIs. As we pointed out before, a flight path **needs** a flight (a
flight *element*, just to make it clearer where I'm going with this), so
we should build the resource on top of an element URI:
* `/travel/api/v1.0/flights/airline_24/paths` and
* `/travel/api/v1.0/flights/airline_24/paths/{path_id}`
... although the last one only makes sense if a flight can have two (or
more) different paths, which would make sense if it goes one way along one
path and comes back along a different one. I don't know enough about flights
to say whether this is possible, but for the sake of explaining everything
about REST, let's go with it, mkay?
And now you may be wondering: why not simply do
`/travel/api/v1.0/flightpaths/{path_id}`? Again, because flight paths are tied
to flights: if the flight doesn't exist, its base resource won't even exist
and, thus, its sub-resources won't be available, which makes a lot of sense.
## Filtering results
Ok, now we know how to retrieve all airports, which is nice, but we don't want
them all: the user will type something and we'll show them only the airports
that match their search. We could screw the user and send the whole list,
letting the application filter it locally, abusing the user's bandwidth and
CPU power -- which isn't nice, since we have a database on our side that can
do this filtering faster.
Because URIs can only point to resources and resource elements, we need a
different way of passing this to the server. And guess what? HTTP has proper
ways to do this: querystrings and forms.
Querystrings, for those unfamiliar with HTTP, are the things that come after
the "?" in the URL. For example, in the URL
"`http://example.com/sayname?name=julio`", "`name=julio`" is the querystring.
It provides a key ("name") and a value ("julio"). Forms are basically the
same, but instead of being part of the URL, they are sent in the body of the
HTTP request (and can be much, much larger than querystrings).
There is one more thing about querystrings and forms: the only way to send
information to the server in a `GET` request is through querystrings, since
GETs do not have a body. DELETEs can have a body, but the RFC says it should
be ignored. POSTs and PUTs do have bodies and, thus, information about the
element to be added/updated should go in there.
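A quick stdlib demonstration of querystring parsing, using the same example URL as above:

```python
from urllib.parse import urlparse, parse_qs

# The querystring is the part after the "?" in the URL.
url = 'http://example.com/sayname?name=julio'
query = parse_qs(urlparse(url).query)
# parse_qs returns a list per key, since a key may appear more than once.
assert query == {'name': ['julio']}
```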
So, for filtering, we could have a "filter" querystring to filter elements.
Couchbase filters airports with a single querystring, so we could simply do
`GET /travel/api/v1.0/airports/?filter=<user input>`
and the user would see only the airports matching their input. And, since we
have each airport as an element, we could also link its flights as a
subresource, with:
`GET /travel/api/v1.0/airports/<airport_id>/flights/`
... which we didn't mention before, but it makes sense now, right?
The Couchbase example also allows showing which flights connect two airports,
and the REST way is, again, using querystrings:
`GET /travel/api/v1.0/airports/<airport_id>/flights/?connectedTo=<airport_id>`
And, if you want to be nice enough, you could even add a "fields" parameter,
so your API consumers could filter out fields they don't want in the results,
to reduce the bandwidth required. But it's all up to you.
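What the server does with `filter` and `fields` could be sketched like this (the airport data, field names and helper below are made up for illustration):

```python
def search_airports(airports, filter_text='', fields=None):
    """Filter airports by name substring and project only the requested fields."""
    matched = [a for a in airports if filter_text.lower() in a['name'].lower()]
    if fields:
        # Keep only the fields the consumer asked for, to save bandwidth.
        matched = [{k: a[k] for k in fields if k in a} for a in matched]
    return matched

airports = [
    {'id': 'airport_3577', 'name': 'Salgado Filho', 'city': 'Porto Alegre'},
    {'id': 'airport_3797', 'name': 'JFK', 'city': 'New York'},
]
search_airports(airports, filter_text='salgado', fields=['id', 'name'])
# -> [{'id': 'airport_3577', 'name': 'Salgado Filho'}]
```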
Weird how everything makes absolute sense here, and we never even called the
"flights" resource, right? That's one of the things about REST: you build
resources in a way that makes sense for the **consumer** of the API, not to
reflect your database.
## Pagination
Just for the sake of completeness, let's talk a bit about pagination.
Pagination, in REST, applies when getting all the elements in a resource, so
it's used in the GET request on the resource. And, because it's part of a GET
request, it should come in the querystring.
There are a couple of ways of doing pagination, in this case:
* Let the consumer specify page size and page number: in this case, you could
have a querystring like `count=15&page=2` to retrieve the second page of 15
elements each. This is the most common way of doing pagination, and Twitter
is one good example of it.
* Have a hardcoded page size: same as before, but the only option available is
`page=2`.
* Have the consumer specify the last seen element and page size. So the first
request would have something like `count=15` to retrieve the first 15
elements, but the next request would have the last element in the list as a
parameter, like `count=15&lastSeen=16` and the server would return all
elements that come after the element with id "16". This prevents duplication
in the results in case a new element is added. Reddit uses this in their
API.
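The last-seen style can be sketched like this (the element ids and parameter names below are hypothetical):

```python
def paginate(elements, count, last_seen=None):
    """Return up to `count` elements that come after the one with id `last_seen`."""
    start = 0
    if last_seen is not None:
        ids = [e['id'] for e in elements]
        if last_seen in ids:
            # Resume right after the last element the consumer already saw.
            start = ids.index(last_seen) + 1
    return elements[start:start + count]

items = [{'id': i} for i in range(1, 41)]
first = paginate(items, count=15)                              # ids 1..15
second = paginate(items, count=15, last_seen=first[-1]['id'])  # ids 16..30
```

Because the cursor is an element id rather than a page number, inserting new elements between requests doesn't shift the results the consumer already received.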
## The type of response
Again, for the sake of completeness: you may have noticed that not once did I
mention the type of data to be returned in each step. That's because REST
does not mandate a format: you could build a whole service that returns HTML
pages in REST style, and that's ok; you could return JSON, which the
Couchbase documentation correctly points out as the most widely used
format; you could return XML; and if you're crazy enough to want to return
things in COBOL format, go for it!
## So, where does the example fail to be REST?
1. All paths are marked with "findAll". "findAll" is **not** a resource and,
thus, shouldn't be in the URL.
2. As I pointed out, flight paths are actually a sub-resource of flights and
should be linked as such. Flight paths should **not** exist if the flight
doesn't exist.
The flight path query uses querystrings to retrieve the paths that go through
two airports, which is the right way of doing it; but, again, flight paths
shouldn't be a resource in themselves.
## How to fix the documentation
The easy way? Remove the "REST" mentions from the pages. I *am* nitpicking
the word "REST" there, I fully recognise it, and I understand that for the
sake of example it doesn't have to be REST, but it seems wrong to tell people
something is REST when it isn't.
If Oracle decided to say "we added a field type that can store huge amounts
of JSON data, and although you can't query its content, we can now say
OracleDB is a NoSQL database", people would lose their minds. But that's kinda
how I'm feeling about this whole thing.

content/dead-github-maintainers.md

@@ -0,0 +1,53 @@
+++
title = "Dear Github Maintainers"
date = 2016-01-15
category = "code"
[taxonomies]
tags = ["github", "comments", "en-au"]
+++
A rebuttal to "Dear Github".
<!-- more -->
So recently there's been this thread going around on Reddit about
[Dear Github](https://github.com/dear-github/dear-github),
which points out some problems with Github's issue pages.
Thing is, most of the problems are not problems with Github itself, but with
the community that grew around it.
For example, the most annoying one is the huge amount of "+1" comments. I've
seen this and yes, it's annoying as hell. Lots of people come around and post
a simple "+1" instead of really contributing. This is *not* an issue with
Github; it is an issue with a community that, instead of helping to fix a
problem, thinks that posting "+1" to signal that it is important to them is
actual help. It isn't. I've seen issues with so many "+1"s that if everyone
who posted one had actually submitted a single change, the bug would be fixed
with lines to spare.
(Unpopular opinion: Github should have support for "+1", but actually *ban* it.
It is unhelpful. If it's important to you, you should at least give fixing the
issue a try instead of posting "+1" and giving yourself a pat on the back for
"helping out".)
Issues missing important information surely are a problem, but that's why you
need to triage your issues. Is there any information missing? You can reply to
the poster. "But why should I ask when I can put up a form for the user to fill
in issues?" Dude, seriously? You're worried about losing 30 seconds of your
life to ask something? Why don't you want to talk to your community, why don't
you want to teach people how to properly report errors? Is it that hard to
be part of a community?
But the sore point is the "if Github was open source, we would fix this
ourselves". [Gitorious](https://en.wikipedia.org/wiki/Gitorious) was open
source and never got that much contribution from the community, to the point
that it was shut down and moved to Gitlab. So I have to ask: if Bitbucket
implemented all this, would you move to it? My guess is an indignant "No",
because Github means exposure while all the other public Git sites do not.
To me, the whole list is not a list of problems with Github itself, but with
the open source (in the general, broad sense) community growing around Github.
We should worry about building communities, not about building code with 400
forks, thousands of "+1" comments and a single maintainer.

content/juliobiason.net-3.0.md

@@ -0,0 +1,39 @@
+++
title = "Announcing JulioBiason.Net 3.0"
date = 2015-02-18
category = "announcements"
[taxonomies]
tags = ["meta", "blog", "pelican", "en-au"]
+++
Short version: New blog URL, engine and layout.
<!-- more -->
Long version: for a long time now, I've been thinking about using a static
blog generator. Not that there is anything wrong with dynamic blog engines (and
I'm a long-time [WordPress](https://wordpress.org/) user, without any issues,
especially since my hosting company -- [Dreamhost](http://www.dreamhost.com/) --
offers easy updates), but... I don't know, I think it's easier to automate some
stuff when all you have are plain files, with no API to talk to.
So, here it is. A new blog URL, so all old posts are still visible at their
original paths (although this will be a problem in the future when I decide to
launch a 4.0 blog, but that's a problem for the future); a new engine, as
WordPress is not static, so I went with
[Pelican](http://blog.getpelican.com/), simply because I know Python (I know
there is a huge community around [Jekyll](http://jekyllrb.com/), but I'm not a
Ruby guy and I don't want to be a Ruby guy); and finally a new layout, since I
took everything I'd been playing with in [Zurb
Foundation](http://foundation.zurb.com/) and, since I'd automagically gain a
responsive layout, I did just that. And yes, the
[theme](https://bitbucket.org/juliobiason/pelican-fancy-foundation) is my
creation -- and that's why there is a bunch of broken stuff. I'll be fixing
things in the future, as I see them -- or as someone reports them to me.
PS: there is actually a hidden reason: some [things I don't want to deal
with again](http://juliobiason.net/2008/02/23/why-half-life-2-failed/), which
were probably crippling my writing (hence the dull and boring content of the
last few months). Because static blogs don't have comments, I may finally feel
fine discussing them.

content/mocking-a-mock.md

@@ -0,0 +1,114 @@
+++
title = "Mocking A Mock"
date = 2016-07-21
category = "code"
[taxonomies]
tags = ["python", "mock", "mongodb", "find", "count", "en-au"]
+++
Mocks are an important part of testing, but learn how to properly mock stuff.
<!-- more -->
A few weeks ago we had a test failing. Now, a failing test is not something
worth a blog post, but the solution -- and the reason it was failing -- is.
Some background information first: the test is part of our Django project;
this project stores part of its information in MongoDB, because the data is
schemaless -- it comes from different sources and each source has its own
format. Because MongoDB is external to our project, it had to be mocked
(sidenote: mocks exist exactly for this: to avoid having to manage
something external to your project).
PyMongo, the MongoDB driver for Python, has a `find()` function, pretty much
like the MongoDB API; this function returns a list (or iterator, I guess) with
all the matching records in the collection. Because it is a list (iterator,
whatever), it has a `count()` function that returns the number of records. So
you have something like this:
```python
connector.collection.find({'field': 'value'}).count()
```
(Find everything which has a field named "field" that has a value of "value"
and count the results. Pretty simple, right?)
The second piece of information you need is about the `mock` module. Python 3
has a module for mocking external resources, which is also available for
Python 2. The interface is the same, so you can
[refer to the Python 3 documentation](https://docs.python.org/dev/library/unittest.mock.html)
for both versions.
A usage example would be something like this: if I had a function like:
```python
def request():
    return connector.collection.find({'field': 'value'})
```
and I wanted to test it, I could do this:
```python
import unittest
from unittest.mock import patch   # "from mock import patch" on Python 2

class TestRequest(unittest.TestCase):
    @patch("MyModule.connector.collection.find")
    def test_request(self, mocked_find):
        mocked_find.return_value = [{'field': 'value', 'record': 1},
                                    {'field': 'value', 'record': 2}]
        result = request()
        self.assertEqual(result, mocked_find.return_value)
```
Kinda sketchy for a test, but I just want to use it to explain what is going
on: the `@patch` decorator creates a stub for any call to
`MyModule.connector.collection.find`; inside the test itself, the stub is
turned into a mock by setting a `return_value`; when the test runs,
the mock library intercepts calls to `collection.find` inside
`MyModule.connector` (because that module imported the PyMongo driver into its
namespace as `connector`) and returns the `return_value` instead.
Simple when someone explains it like this, right? Well, at least I hope you got
the basics of this mocking stuff.
Now, what if you had to count the number of results? It's pretty damn easy to
realize how to do so: just call `count()` on the resulting list -- or make the
mock return an object that has a `count()` method.
The whole problem we had was that the result of `find()` was irrelevant and
all we wanted was the count. Something like:
```python
def has_values():
    elements = connector.collection.find({'field': 'value'}).count()
    return elements > 1
```
First of all, you can't patch `MyModule.connector.collection.find.count`,
because then you'd only stub the `count` call, not `find`, which would
actually try to connect to MongoDB; so the original patch is required. And you
can't patch both `find` and `count`, because the first patch returns a new
`MagicMock` object, which will not be patched (after all, it is *another*
object). The original developer tried to fix it this way:
```python
mocked_find.count.return_value = 0
```
... which, again, doesn't work, because the call to `find()` returns a
`MagicMock` that doesn't have its `count` patched. But the developer never
realized that, because `MagicMock` tries its best to *not* blow up your tests,
including providing return values for conversions like... `int`. And `int()`
of a `MagicMock` always returns 1.
Is your head spinning yet? Mine sure did when I realized the whole mess that
had been made. And let me repeat this: the problem was *not* that MongoDB was
being mocked, but that it was being *mocked in the wrong way*.
The solution? As pointed out above, make `find` return an object with a
`count` method.
```python
count_mock = MagicMock(return_value=0)
mocked_find.return_value = MagicMock(**{'count': count_mock})
```
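Putting it all together, here is a minimal, self-contained sketch of the fix; the `FakeCollection`/`FakeConnector` classes are stand-ins for the real module that imports PyMongo, so the whole thing runs without a database:

```python
from unittest.mock import MagicMock, patch

class FakeCollection(object):
    """Stand-in for the PyMongo collection our module would import."""
    def find(self, query):
        raise RuntimeError('would hit a real MongoDB')

class FakeConnector(object):
    collection = FakeCollection()

connector = FakeConnector()

def has_values():
    elements = connector.collection.find({'field': 'value'}).count()
    return elements > 1

with patch.object(FakeCollection, 'find') as mocked_find:
    # find() now returns an object whose count() method returns 0,
    # so the comparison happens on a real int, not on a MagicMock.
    count_mock = MagicMock(return_value=0)
    mocked_find.return_value = MagicMock(**{'count': count_mock})
    assert has_values() is False
```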

content/pre-order-the-case-of-no-mans-sky.md

@@ -0,0 +1,60 @@
+++
title = "Pre-Orders: The Case of No Man's Sky"
date = 2016-08-25
category = "thoughts"
[taxonomies]
tags = ["pre-order", "grim dawn", "no man's sky", "en-au"]
+++
[No Man's Sky](http://www.no-mans-sky.com/) is getting a lot of heat recently
because, well, the game is not everything the developers promised. And a lot
of people are putting the blame on pre-orders and whatnot.
<!-- more -->
Thing is, this is not a problem with pre-orders. This is a problem with a
development company not keeping up with the times.
Example: [Grim Dawn](http://www.grimdawn.com/). Although not a pre-order per
se, the game was on Kickstarter. Today, the game is polished, fun, has lots
of stuff to do, and nowhere is anyone claiming this "pre-order" thing ruined
the game.
The difference between Grim Dawn and No Man's Sky is that Crate, the
developer of the former, continuously delivered versions to get feedback.
Falling through the world? Ok, we can fix that. Game doesn't run on your rig
even though you have the minimum specs? There is something wrong with our
engine. That feature? Yeah, it's too big for now, we'll work on it later.
Not on this list, but ArenaNet did something close to that with Guild Wars 2:
people who pre-purchased the game -- an "extreme" version of pre-ordering --
could participate in the closed beta events. Those events, although not
spanning whole maps, allowed players to experience some part of the game and
give feedback. ArenaNet would even say "we just want to stress the servers,
so weird things could happen" and people were fine with that.
Hello Games, on the other hand, did all development behind closed doors. Sure,
they are a small company, but there was nothing stopping them from running
some open beta test or whatever to get feedback. Well, except one thing:
Sony.
Sony injected money into Hello Games for their first title (I was about to say
"a lot of money", but heck if I know how much they funded) and wanted it on
their console. Now, consoles do not have a "here, play this for testing"
option or a "sign up for this and we'll add your console ID to our database so
you can download the game". To keep the hype up, no one could see the game
before release. No previews, no betas, no nothing. A feature wasn't fun? Who
would know; it's scrapped now. The engine blows up on certain configurations?
The only way to find out is after the final release.
So, again, the problem is not "pre-orders are bad and you should feel bad".
The problem is a company not keeping up with the times. A lot of companies
are now sharing things early to get broader feedback than just friends and
family: Microsoft continuously releases Windows builds through their "Windows
Insider" program, which anyone can join; Apple gives out betas of all their
OSes to anyone who wants to test them. The idea that "many eyes make all bugs
shallow" finally caught on and people realised it was right.
But, apparently, Hello Games + Sony didn't.

content/python-2-3-six.md

@@ -0,0 +1,312 @@
+++
title = "Python 2 + 3 = Six"
date = 2016-11-21
category = "code"
[taxonomies]
tags = ["python", "six", "python 2", "python 3", "tchelinux", "pt-br"]
+++
"Six" is a small Python library that can help you move your code from
Python 2 to Python 3.
<!-- more -->
{% note() %}
(This post is related to the presentation I gave on November 19th at
TchêLinux. The slides are available
[in the presentations area](http://presentations.juliobiason.net/python23six.html).)
{% end %}
Before anything else, there is one question we need to answer: why would
anyone use Python 3?
* All strings are unicode by default; this solves the pile of macabre,
annoying, damned `UnicodeDecodeError` problems;
* `Mock` is a standard Python class; it can still be installed with `pip` and
the syntax is exactly the same, but it's one dependency less;
* `Enum` is a standard Python class; Enum is one of the most interesting
abuses of classes in Python and genuinely useful;
* AsyncIO and all the lazy evaluation that Python 3 brought; lots of things
in Python 3 stopped "building a list" and started returning an iterator or a
generator; with AsyncIO, you get a step further in this idea of lazy
generation and, according to people smarter than me, with PyUV, Python can
be as fast as or faster than Node;
* And, most importantly, **Python 2 support ends in 2020!**
{% note() %}
There is also string interpolation with the new `f` prefix; the functionality
is similar to calling `str.format` with `locals()`, for example,
`f'{element} {count}'` is equivalent to
`'{element} {count}'.format(**locals())` (as long as you have `element` and
`count` as local variables in your function).
{% end %}
The last point is the most important one. You may think "but there are still
three years to go", but Christmas is almost here, Carnival comes right after
and, before you know it, it's 2020.
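A quick check of the f-string equivalence mentioned in the note above (`element` and `count` are just illustrative locals):

```python
def render():
    element = 'a'
    count = 3
    # f-string (Python 3.6+) vs str.format with **locals()
    with_f = f'{element} {count}'
    with_format = '{element} {count}'.format(**locals())
    return with_f, with_format

# Both render the same string from the local variables.
assert render() == ('a 3', 'a 3')
```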
## The Road to Python 3
For those who want to start porting their applications to Python 3 right away,
there are two ways:
The first is to run your application with `python -3 [script]`; this makes the
Python interpreter warn about any piece of code it cannot convert correctly.
I ran a personal script [dating back to 2003](https://bitbucket.org/juliobiason/pyttracker)
and Python reported nothing.
{% note() %}
Just for the sake of clarity: the code I was writing was already fairly
correct and following Pythonic standards; in 2014 I was still seeing cases of
code running on Python 2.6 that used `has_key()`, which was deprecated in
Python 2.3.
{% end %}
There are several reasons for this:
1. People got used to writing "Pythonic" code; the language itself didn't
change that much.
2. Although some things were removed from the Python language, they were
slowly reintroduced; one example is the string interpolation operator (`%`),
which had been removed in favor of `str.format` but ended up coming back.
The second way to port your code to Python 3 is to use the `2to3` tool. It
checks for the known Python 3 changes (for example, `print` becoming a
function, changes to some standard library packages) and produces a patch to
be applied later.
Among the conversions `2to3` makes are: the `iter`-something calls become the
version without the prefix (for example, `iteritems()` becomes simply
`items()`); `print` is converted to a function; several adjustments are made
to `urllib` and `urlparse` calls (these two were merged in Python 3 and the
former went through several internal reorganizations); `xrange` becomes
`range`; `raw_input` is now called `input` and has new output handling; among
others.
There is just one small problem in this Python 2 to Python 3 conversion: as
seen in the list above, some commands exist in both versions, but with
different functionality; for example, `iteritems()` is converted to simply
`items()`, but both methods exist in Python 2: the former returns an iterator
and the latter returns a new list with tuples for all the elements of the
dictionary (in Python 3, an iterator is returned). So, although the code is
grammatically the same in both Python 2 and Python 3, semantically the two
are different.
This problem of "same commands, different results" can be a big one if the
system runs in environments that don't allow easy changes -- for example, it
runs on a CentOS 4, or it still needs compatibility with Python 2.6, both
"problems" actually being requirements from the infrastructure group.
## Six (and `__future__`) to the Rescue
To solve the problem of code that has to run on both versions, there is the
[Six](https://pythonhosted.org/six/) library; it bridges Python 2 and
Python 3 and provides an interface so that Python 2 code can be ported to
Python 3 while staying compatible.
In a (relatively silly) example:
```python
import collections

class Model(object):
    def __init__(self, word):
        self._count = None
        self.word = word

    @property
    def word(self):
        return self._word

    @word.setter
    def word(self, word):
        self._word = word
        self._count = collections.Counter(word)

    @property
    def letters(self):
        return self._count

    def __getitem__(self, pos):
        return self._count[pos]

if __name__ == "__main__":
    word = Model('This is an ex-parrot')
    for letter, count in word.letters.iteritems():
        print letter, count
```
In this example, we have a class that stores a phrase and the number of times
each letter appears, using `Counter` for that (since `Counter` counts how many
times an element appears in an iterable, and strings *are* iterables).
In this example, we have the following problems:
1. `class Model(object)`: in Python 3, all classes are "new-style" classes
and deriving from `object` is no longer necessary (but it doesn't affect how
the class works);
2. `for letter, count in word.letters.iteritems()`: as discussed earlier,
`iteritems()` no longer exists and became `items()`; `items()` exists in
Python 2, but with different functionality. In our case here, the result of
the operation stays the same, but memory usage goes up on every call.
3. `print letter, count`: `print` is now a function and works slightly
differently from the Python 2 version.
So, to make this code compatible with Python 2 and Python 3 at the same time, we have to do the following:

> `class Model(object)`

Nothing needs to be done.

> `print letter, count`

```python
from __future__ import print_function
print('{} {}'.format(letter, count))
```

`print` as a function can be "brought from the future" with the `__future__` module (only available in Python 2.7); since printing several variables separated by commas is not recommended, `str.format` is the recommended way.
A better option (in my opinion) is:

```python
from __future__ import print_function
print('{letter} {count}'.format(letter=letter,
                                count=count))
```

This way, the parameters used in the output are named and can be moved around. It raises a strange error when a name used in the format string is not passed in the `format` parameter list, but in more complex strings the result is easier to understand (for example, I find `{letter} appears {count} times` easier to follow than `{} appears {} times`; also, you can change the order of the variables in the format string without changing the order of the parameter list).
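A small illustration of both points, the reordering and the "strange error" (not from the original post):

```python
template = '{letter} appears {count} times'

# named placeholders can be passed in any order
assert template.format(count=2, letter='s') == 's appears 2 times'

# a name missing from the arguments raises KeyError -- the
# "strange error" mentioned above
try:
    template.format(letter='s')
except KeyError as missing:
    print('missing name:', missing)
```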
An even better option is:

```python
import six
six.print_('{letter} {count}'.format(letter=letter,
                                     count=count))
```

With Six, the dependency on `__future__` goes away, so the same code can be used on Python 2.6.

> `for letter, count in word.letters.iteritems():`

```python
import six
for letter, count in six.iteritems(word.letters):
```

Six provides a unified interface for iterating over items in both Python 2 and Python 3: `six.iteritems()` will call `iteritems()` when running on Python 2 and `items()` when running on Python 3.

And with that, our somewhat silly code is compatible with Python 2 and Python 3 and runs identically on both.
But let's move to a real example:

```python
import urllib
import urlparse


def add_querystring(url, querystring, value):
    frags = list(urlparse.urlsplit(url))
    query = frags[3]
    query_frags = urlparse.parse_qsl(query)
    query_frags.append((querystring, value))
    frags[3] = urllib.urlencode(query_frags)
    return urlparse.urlunsplit(frags)


if __name__ == "__main__":
    print add_querystring('http://python.org', 'doc', 'urllib')
    print add_querystring('http://python.org?doc=urllib',
                          'page', '2')
```
{% note() %}
Yes, yes, the code could be a simple "check whether the URL has a question mark; if it does, append `&` and the query string; if it doesn't, append `?` and the query string". The point is: this way, I get a solution that accepts any URL, in any format, with anything in the middle, because the Python standard library guarantees it will be parsed correctly.
{% end %}

This is the code of a function used to add a query string to a URL. The problem with this function is that both `urllib` and `urlparse` went through big changes, even ending up under the same module (it's all `urllib.parse` now).

To make this code compatible with Python 2 and 3 at the same time, we need the `six.moves` module, which holds all those scope changes in the standard library (including, in this case, `urllib` and `urlparse`).
```python
import six


def add_querystring(url, querystring, value):
    frags = list(six.moves.urllib.parse.urlsplit(url))
    query = frags[3]
    query_frags = six.moves.urllib.parse.parse_qsl(query)
    query_frags.append((querystring, value))
    frags[3] = six.moves.urllib.parse.urlencode(query_frags)
    return six.moves.urllib.parse.urlunsplit(frags)


if __name__ == "__main__":
    six.print_(add_querystring('http://python.org', 'doc', 'urllib'))
    six.print_(add_querystring('http://python.org?doc=urllib',
                               'page', '2'))
```
What we did here was use `six.moves.urllib.parse`. That structure is no accident: in Python 3, the `urlparse` functions now live in `urllib.parse`; Six assumed that the correct location for the functions inside "itself" would be the packages used by Python 3.
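As a side note, the long dotted paths can also be imported directly; and because `six.moves` mirrors the Python 3 layout, the Python 3 standard library can serve as a fallback when Six is not installed (a sketch, not from the original code):

```python
try:
    # with Six installed, this works on both Python 2 and Python 3
    from six.moves.urllib.parse import urlsplit
except ImportError:
    # six.moves mirrors the Python 3 package layout, so on Python 3
    # the standard library exposes the very same name
    from urllib.parse import urlsplit

frags = urlsplit('http://python.org/doc?page=2')
print(frags.netloc)  # 'python.org'
print(frags.query)   # 'page=2'
```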
And so we have two examples of programs that run the same way on both Python 3 and Python 2.

One last tip: if some software you use doesn't run correctly under Python 3, using Six may help keep the current code alive until a rewrite solves the problem.
## Other Questions

### What about being stuck with Six forever?

A good part of today's applications have already announced a "cut-off" for their Python 2 support. For example, Django announced that version 2.0 of the framework is coming in 2020, and that version will support Python 3 only.

### How hard is it to port to Python 3?

Not very hard -- now. Many of the removed things that used to cause headaches during conversion came back; the classic case is the string interpolation operator `%`, which was removed and would have to be replaced by `str.format`, but ended up returning. Another reason is that scripts are more "Pythonic" nowadays, largely thanks to people like [Raymond Hettinger](https://rhettinger.wordpress.com/), who has been making excellent videos on how to write Python code with Python (that is, "Pythonic" code). And, as a personal anecdote, I can say my 2003 code ran under `python -3` without raising a single warning.
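The `%` operator mentioned above behaves the same way on both versions, side by side with `str.format` (illustrative values, not from the original post):

```python
word = 'parrot'
count = 2

# %-interpolation was slated for removal but survived into Python 3
old_style = '%s appears %d times' % (word, count)

# str.format, the replacement suggested at the time
new_style = '{} appears {} times'.format(word, count)

assert old_style == new_style == 'parrot appears 2 times'
print(old_style)
```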

content/the-day-i-found-my-old-code.md
+++
title = "The Day I Found My Old Code"
date = 2015-12-18
category = "code"
[taxonomies]
tags = ["code", "python", "pep8", "pylint", "en-au"]
+++
I found a piece of code I wrote two years ago, following a lot of linters. I'm amazed at how readable the code still is.
<!-- more -->
Today, walking through a client repository, I found a module I wrote two years ago in Python. At the time, we lacked the knowledge to write proper tests, but we used a lot of other tools: PEP8 and Pylint, mostly.

Today-me is pissed at two-years-ago-me for the lack of tests, but where my memory forgot the nuances of the project, the huge amount of comments and proper documentation makes up for it.
For example, every pylint disable has an explanation of why it was disabled:
```python
# flask has a weird way to deal with extensions, which work fine but confuses
# the hell out of PyLint.
```
Related modules are loaded in sequence, with line breaks between different
sources:
```python
from flask.ext.babel import Babel
from flask.ext.babel import refresh
from flask.ext.gravatar import Gravatar
from werkzeug.routing import NotFound
from werkzeug.routing import RequestRedirect
```
Every variable and every function is documented in proper Sphinx format, which helps in understanding what the variable/function does:
```python
#: Session duration time
#: The time is given as number and a time interval ("m" for minutes, "h" for
#: hours, "d" for days and "w" for weeks), e.g., "3d". A value of "None" will
#: make the session last till the user closes the browser.
SESSION_EXPIRATION = "1d"
```
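Just to make the documented format concrete, here is a hypothetical parser for those duration strings (`UNITS` and `session_seconds` are names made up for this sketch; they are not part of the original project):

```python
# seconds per unit, following the convention in the comment above:
# "m" for minutes, "h" for hours, "d" for days, "w" for weeks
UNITS = {'m': 60, 'h': 3600, 'd': 86400, 'w': 604800}


def session_seconds(spec):
    """Convert a "3d"-style duration into seconds; None means the
    session lasts until the user closes the browser."""
    if spec is None:
        return None
    return int(spec[:-1]) * UNITS[spec[-1]]


assert session_seconds('1d') == 86400
assert session_seconds('30m') == 1800
assert session_seconds(None) is None
```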
```python
def reroute(route):
"""Route control. The route must exist in the known routes list to return
a valid result; unknown routes will be redirected to the 404 page; if the
route exists but it's marked as "maintenance", the request will be
redirected to the 503 page."""
```
Also, I found a class with a docstring of about 20 lines. It explains every
single parameter in its `__init__` function, which makes perfect sense when
you generate the documentation.
Where a function name alone didn't say much (because it only made full sense inside its own object/module), a comment was added to explain what the function was actually doing:
```python
inject(current_app) # inject values if run stand-alone modules
load_routes(current_app) # load the routing information
register_filters() # register jinja filters
register_functions() # register jinja functions
register_tests() # register jinja tests
set_session_time() # define the cookie time
```
Also, I had the slight habit of putting large comments in the code when
something was kinda hacky:
```python
# Now you're asking yourself: "Why heuristic find?" The reason is
# simple: in _function(), we add a new endpoint on top of an
# existing endpoint; because we do that on top of anything, we don't
# know, for sure, which one of the parameters the user (the other
# programmer, in this case) used in their URLs. So we need to go
# through all the parameters they expect to receive in their detail
# function in the hopes of finding something that actually matches a "pk".
```
It doesn't make much sense here, but believe me, it works. I was just reading the code, saw a function called `heuristic_find` and went "Man, what drugs did I take to call it 'heuristic_find'?" And BOOM, there was the reason it was called like that.
Ok, honesty time: I wasn't the only one writing this code. But since I started and enforced all those rules (thanks to the client's input) and wrote a huge part of the base code, the code is still readable two years later.
Yeah, I'm proud of it.

content/the-sad-life-of-walter-mitty.md
+++
title = "The Sad Life of Walter Mitty"
date = 2015-03-28
category = "thoughts"
[taxonomies]
tags = ["movies", "the secret life of walter mitty", "rethink", "review", "en-au"]
+++
I once wrote about [The Secret Life of Walter
Mitty](http://juliobiason.net/2014/11/13/the-secret-life-of-walter-mitty-2013/)
and what a nice story it was about a guy outgrowing his daydreams.
<!-- more -->
But today I realized I had it all wrong.
The second time I watched the movie, in the scene where Walter talks to Todd (from E-Harmony) on top of the Himalayas, I thought "Well, that's one hell of a mobile company; they have signal on top of the Himalayas".
The third time I realized *how* the signal was that good: Walter never went to
the Himalayas.
Let's assume that, in the story, Walter really went to Greenland and Iceland and came back. And then he got fired for losing negative #25. This is where I believe everybody is tricked: at that point, Walter actually lost his only connection to real life (his job) and descended into a full-time illusion.
That's why the reclusive Mitty went to Afghanistan and had to give cake to guys with guns. That's how his call is crystal clear on top of the Himalayas. That's why Todd, who never really knew Walter, went to the airport to rescue him. That's how his E-Harmony profile suddenly became the hottest profile ever. That's why the piano check was so large, so he wouldn't need to worry about his unemployed life. That's why his mom saved the wallet. That's how he finally manages to face Ted. That's why Cheryl is right there when he gets his severance check. And that's how his damn face appears on the cover of Life.
That even explains why Sean never took the picture -- if there is no picture,
there is nothing to show that the whole thing was a dream.
Once you pick the "he's in complete disconnection from reality and lives in his imagination now" reading, the whole ending stops being a succession of lucky happenings and starts to make sense. A sad kind of sense, but sense nonetheless.

content/when-i-used-pep8-to-fuck-up-code.md
+++
title = "When I Used PEP8 To Fuck Up Code"
date = 2016-07-19
category = "code"
[taxonomies]
tags = ["python", "pep8", "readability", "en-au"]
+++
We "inherited" some Python code recently. Although another team was working on it, we now have to support it and keep it going. The previous team at least tried to use Pylint and follow PEP8. And I say "tried" because their `pylintrc` has a couple of exceptions and their PEP8 settings extended the maximum column to 100.
<!-- more -->
{% note() %}
Pylint exceptions are a fairly common case these days, especially in a Django project. But plain, blanket `pylintrc` exclusions without any pointer to *why* the exception was added are dumb, IMHO. I had a project where we decided to add pylint exceptions inside the code, but for every exception there had to be a comment preceding it explaining the reason ("the framework doesn't expose this directly", "pylint can't see this variable, but it is there", "it's the common way to name this variable" and so on).
{% end %}
{% end %}
Quick pause here, 'cause I know a bunch of people will complain with a "But monitors these days are very large and you don't need to stick to column 80; we don't use CGA anymore, old person!". The thing about the maximum column at 80 is *not* about "being visible on every CGA" but a measure of readability: if you speak in short, concise sentences, people will get the idea quickly; if you keep a stream of words going non-stop, never reaching a conclusion and without any punctuation to keep the ideas flowing, you will end up with something that is easier to forget and whose central idea will be lost (and I freaking hope you got what I just did there). It's tiring to read a very long sentence; it's easy to keep the context of a short one.
In the spirit of "proper" PEP8, I reformatted one of the failing tests to follow the 80-column limit. And now the code looks like crap. And I'll commit it like that. It's not because I hate my coworkers, but to point out that, when code is this painful to read, its structure is too complex. If someone comes and says "damn, this test is hard to read", I'll be able to point out that it is not the test that is hard to read, but the code, whose complexity is leaking into the test code; it is now a good time to refactor and simplify things, making them easier to read.
{% note() %}
Actually, the reason for it to fail is too damn fun and worth a proper blog post of its own. Stay tuned!
{% end %}
Not that we can simply stop working and fix the damn architecture, but we can at least keep this beast around until everybody gets pissed and realizes it *desperately needs* a refactor.
{% note() %}
Weird thing: people usually assume some countries are the center of bad code; this codebase comes from a "first world country" and, heck, it has one of the worst designs I ever saw. I'll not name names here, to protect the (maybe) innocent. But in the second week of training, I realized this whole project already carries at least 6 months of technical debt.
{% end %}