Julio Biason
6 years ago
11 changed files with 1086 additions and 0 deletions
@ -0,0 +1,238 @@
|
||||
+++ |
||||
title = "Couchbase Example and REST" |
||||
date = 2016-01-12 |
||||
category = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["rest", "couchbase", "example", "restful", "en-au"] |
||||
+++ |
||||
|
||||
Using the Couchbase example to show how REST works.
||||
|
||||
<!-- more --> |
||||
|
||||
Let me start by pointing out that I'm a RESTnazi: I'm the kind of guy that
will get into a fight with anyone that says things like "Ok, that's because
this is just REST, not RESTful" because... well, because there is no
difference between REST and RESTful.
||||
|
||||
And today I found something weird while reading |
||||
[the Couchbase documentation](http://developer.couchbase.com/documentation/server/4.1/travel-app/travel-app-walkthough.html) |
||||
with them claiming that their example is REST while... well, it isn't. |
||||
|
||||
But hey, that's a good opportunity to explain a bit about what REST is (and
what it is not).
||||
|
||||
## What is REST? |
||||
|
||||
REST is an architecture/design pattern/pick your buzzword built on top of HTTP |
||||
to provide information. It has two components: |
||||
|
||||
* **Resources**: Those are the elements in your system: your users, your books,
  your airports, your flights and such.

* **Verbs**: Those are the things you do with your resources: you GET them,
  you UPDATE them, and so on.
||||
|
||||
There is no true "guideline" on how to write resources. It's usually done with |
||||
nouns in their plural form (or, at least, that's what [Apigee](http://apigee.com/about/) |
||||
concluded after checking a bunch of APIs around). Those resources are mapped |
||||
through URLs with some base. |
||||
|
||||
Let's pick the example from Couchbase: It's a travel app, with airports, |
||||
flights and flight paths. We could use a base URI scheme of |
||||
`/travel/api/v1.0/` because: |
||||
|
||||
1. The travel app could also provide a user interface through `/travel/`, so |
||||
we keep the API endpoint on `api` to not mix things. |
||||
|
||||
2. We are versioning the API (here, v1.0). This is a recommendation from
   Apigee and, again, not part of the architecture/design pattern/buzzword.
||||
|
||||
On top of this base URI, we'll build our resource URLs:
||||
|
||||
* `/travel/api/v1.0/airports/` and |
||||
* `/travel/api/v1.0/flights/` |
||||
|
"Where is the flight path endpoint?", you must be asking. Well, I'll tell you
about it later, hold on a second; we'll use those two to explain the very
basics of REST first, ok?
||||
|
||||
Besides those two URIs, we need two more: One for each resource to access |
||||
direct elements. So now we have: |
||||
|
||||
* `/travel/api/v1.0/airports/`; |
||||
* `/travel/api/v1.0/airports/{airport_id}`; |
||||
* `/travel/api/v1.0/flights/` and finally |
||||
* `/travel/api/v1.0/flights/{flight_id}`. |
||||
|
||||
So, now that we have our resources, we need ways to manage their contents. For
this, we use the "verbs" I mentioned before. The thing about REST is that
those actions are directly tied to the HTTP verbs:
||||
|
||||
* **GET** will retrieve elements in the resource;
* **POST** will insert a new element in the resource;
* **PUT** is used to update the information of an element;
* **DELETE** is used to remove an element from the resource.
||||
|
||||
{% note () %} |
||||
If you want an easy mnemonic, "PUT" has a "U", for "update". Yes, it's silly,
but it works (at least, for me). Also, a "PUT" directly on a resource means
"replace the whole database with this information" and, thus, is not really
widespread.
||||
{% end %} |
||||
|
||||
{% note() %} |
||||
You can add a DELETE for your whole resource, if you're crazy and bold enough. |
||||
{% end %} |
||||
|
||||
And adding those two we have: |
||||
|
||||
* Get a list of all airports: `GET /travel/api/v1.0/airports/`
* Add a new airport: `POST /travel/api/v1.0/airports/`
* Get information of a single airport: `GET /travel/api/v1.0/airports/airport_3577`
* Update the information of an airport: `PUT /travel/api/v1.0/airports/airport_3577`
||||
|
||||
... and so on. |
||||
|
||||
Easy as pie, right? |
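
To make the mapping between verbs and URLs a bit more concrete, here is a
minimal sketch in Python of how a server could route those requests. Everything
in it is an assumption made for illustration: the `handle` function, the
in-memory `store` dictionary, the `id` key in the POST body and the status codes
are mine, not something taken from the Couchbase example.

```python
BASE = "/travel/api/v1.0/airports/"


def handle(store, method, path, body=None):
    """Apply one HTTP-style request to the airports resource.

    `store` is a plain dict standing in for a real database.
    Returns a (status_code, payload) tuple.
    """
    if not path.startswith(BASE):
        return 404, None
    airport_id = path[len(BASE):].strip("/")

    if not airport_id:
        # The request targets the resource itself.
        if method == "GET":
            return 200, sorted(store)      # list the ids of all elements
        if method == "POST":
            store[body["id"]] = body       # insert a new element
            return 201, body
        return 405, None

    # The request targets a single element of the resource.
    if airport_id not in store:
        return 404, None
    if method == "GET":
        return 200, store[airport_id]
    if method == "PUT":
        store[airport_id] = body           # update the element
        return 200, body
    if method == "DELETE":
        del store[airport_id]
        return 204, None
    return 405, None
```

Note that the URL only ever says *which* resource or element is being touched;
*what* happens to it is decided by the verb alone.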
||||
|
||||
## The "Flight Path" resource |
||||
|
||||
Now let's go back to the "flight path" resource, which I left behind. Thing
is, a flight path does not exist on its own. If a flight doesn't exist, the
flight path doesn't exist either, right? And if a flight exists, it should have
a path, right?
||||
|
||||
So a flight path is a resource linked directly to our resource of flights. For
this, REST allows resource chaining by just adding another layer on top of
existing URIs. As pointed out before, a flight path **needs** a flight (a
flight *element*, just to make it clearer where I'm going with this), so
we should build the resource on top of an element URI:
||||
|
||||
* `/travel/api/v1.0/flights/airline_24/paths` and |
||||
* `/travel/api/v1.0/flights/airline_24/paths/{path_id}` |
||||
|
||||
... although the last one only makes sense if a flight could have two (or
more) different paths, say, going one way along one path and coming back along
a different one. I do not know enough about flights to know if this is
possible but, for the sake of explaining everything about REST, let's go with
it, mkay?
||||
|
||||
And now you may be wondering: Why not simply do
`/travel/api/v1.0/flightpaths/{path_id}`? Again, because flight paths are tied
to flights: if a flight doesn't exist, its element URI won't exist either and,
thus, its sub-resources won't be available, which makes a lot of sense.
||||
|
||||
## Filtering results |
||||
|
||||
Ok, now we know how to retrieve all airports, which is nice, but we don't want |
||||
them all: the user will type something and we'll show them only the airports |
||||
that match their search. We could screw the user and send the whole list to |
||||
them and let the application filter it locally, abusing the user bandwidth and |
||||
CPU power -- which isn't nice, since we have a database on our side that can |
||||
do this filtering faster. |
||||
|
||||
Because we can use URIs only to point to resources and resource elements, we
need a different way of passing this to the server. And guess what? HTTP has
the proper way to do this: querystrings and forms.
||||
|
||||
Querystrings, for those unfamiliar with HTTP, are the things that come after
the "?" in the URL. For example, in the URL
"`http://example.com/sayname?name=julio`", "`name=julio`" is the querystring.
It provides a key ("name") and a value ("julio"). Forms are basically the
same, but instead of being part of the URL, they are sent in the body of the
HTTP request (and can be much, much larger than querystrings).
||||
|
||||
There is one more thing about querystrings and forms: The only way to send |
||||
information to the server in a `GET` request is through querystrings, since |
||||
GETs do not have a body. DELETEs can have a body, but the RFC says it should |
||||
be ignored. POST and PUT do have bodies and, thus, information about the |
||||
element to be added/updated should come in there. |
||||
|
||||
So, for filtering, we could have a "filter" querystring to filter elements. |
||||
Couchbase filters airports with a single querystring, so we could simply do |
||||
|
||||
`GET /travel/api/v1.0/airports/?filter=<user input>` |
||||
|
||||
So the user will see a bunch of airports matching their input. And, since we
have all the airports, we could also link the flights as a subresource of them, with:
||||
|
||||
`GET /travel/api/v1.0/airports/<airport_id>/flights/` |
||||
|
||||
... which we didn't mention before, but it is now making sense, right? |
||||
|
||||
The Couchbase example also allows showing which flights connect two airports
and the REST way is, again, using querystrings:
||||
|
||||
`GET /travel/api/v1.0/airports/<airport_id>/flights/?connectedTo=<airport_id>` |
||||
|
||||
And, if you want to be nice enough, you could even add a "fields" parameter, |
||||
so your API consumers could filter out fields they don't want in the results, |
||||
to reduce the bandwidth required. But it's all up to you. |
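
As a rough sketch of both ends of that conversation, this is how a consumer
could build the filtered URL and how a server could read the parameter back,
using only Python's standard library. The helper names `build_url` and
`read_params` are made up for this example, and so is the search text.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, **params):
    """Consumer side: append the parameters to `base` as a querystring."""
    if not params:
        return base
    # urlencode() takes care of escaping spaces and special characters.
    return base + "?" + urlencode(params)


def read_params(url):
    """Server side: turn the querystring back into a dict (first value wins)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


url = build_url("/travel/api/v1.0/airports/", filter="san fran")
print(url)               # /travel/api/v1.0/airports/?filter=san+fran
print(read_params(url))  # {'filter': 'san fran'}
```

The resource URL stays exactly the same; only the querystring changes from one
search to another.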
||||
|
||||
Weird how things make absolute sense here, and we never called the "flights"
resource, right? That's one of the things about REST: you build resources in a
way that makes sense for the **consumer** of the API, not to reflect your
database.
||||
|
||||
## Pagination |
||||
|
||||
Just for the sake of completeness, let's talk a bit about pagination. |
||||
|
||||
Pagination, in REST, works for getting all the elements in the resource, so |
||||
it's used in the GET request for the resource. And, because it's part of the |
||||
GET request, it should come in the querystring. |
||||
|
||||
There are a couple of ways of doing pagination, in this case: |
||||
|
||||
* Let the consumer specify page size and page number: In this case, you could
  have a query string like `count=15&page=2` to retrieve the elements from the
  second page of 15 elements each. This is the most common way of doing
  pagination and Twitter is one good example of this.

* Have a hardcoded page size: Same as before, but the only option available is
  `page=2`.
||||
|
||||
* Have the consumer specify the last seen element and page size. So the first |
||||
request would have something like `count=15` to retrieve the first 15 |
||||
elements, but the next request would have the last element in the list as a |
||||
parameter, like `count=15&lastSeen=16` and the server would return all |
||||
elements that come after the element with id "16". This prevents duplication |
||||
in the results in case a new element is added. Reddit uses this in their |
||||
API. |
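
Here is a small sketch comparing the first and the last approaches, just to
show why the "last seen" one avoids duplicates. It assumes the elements are
dicts with a numeric `id`, kept sorted by that id; the function names are
invented for the example and do not come from Twitter's or Reddit's APIs.

```python
def page_by_number(items, count, page):
    """`count=15&page=2` style: pages are numbered starting from 1."""
    start = (page - 1) * count
    return items[start:start + count]


def page_after(items, count, last_seen=None):
    """`count=15&lastSeen=...` style: elements that come after `last_seen`."""
    remaining = [item for item in items
                 if last_seen is None or item["id"] > last_seen]
    return remaining[:count]


items = [{"id": n} for n in range(1, 41)]
first_page = page_after(items, 15)      # ids 1 to 15

# A new element shows up at the start of the list between two requests.
items.insert(0, {"id": 0})

# By page number, id 15 is returned again at the start of page 2...
print(page_by_number(items, 15, 2)[0])              # {'id': 15}
# ... while asking for "whatever comes after 15" is not affected.
print(page_after(items, 15, last_seen=15)[0])       # {'id': 16}
```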
||||
|
||||
## The type of response |
||||
|
||||
Again, for the sake of completeness, you may have noticed that I never once
mentioned the type of data to be returned in each step. That's because REST
does not have a format: You could build a whole service that returns HTML
pages in REST format, and that's ok; you could return JSON, which the
Couchbase documentation correctly points out is the most widely used
format; you could return XML; if you're crazy enough and want to return in
COBOL format, go for it!
||||
|
||||
## So, where does the example fail to be REST?
||||
|
||||
1. All paths are marked with "findAll". "findAll" is **not** a resource and, |
||||
thus, shouldn't be in the URL. |
||||
|
||||
2. As I pointed out, flight paths are actually a sub-resource of flights and
   should be linked. Flight paths should **not** exist if the flight doesn't
   exist.

The flight path query uses querystrings to retrieve the information for paths
that go through two airports, which is the right way of doing it but, again, it
shouldn't be a resource on its own.
||||
|
||||
## How to fix the documentation
||||
|
||||
Easy way? Remove the "REST" mention in the pages. I *am* nitpicking the word
"REST" there, I fully recognize it, and I understand that for the sake of
example it doesn't have to be REST, but it seems wrong to tell people
something is REST when it isn't.
||||
|
||||
If Oracle decided to say "we added a field type that can store huge amounts |
||||
of JSON data, and although you can't query its content, we can now say |
||||
OracleDB is a NoSQL database", people would lose their minds. And that's kinda
how I'm feeling about this whole thing.
@ -0,0 +1,53 @@
|
||||
+++ |
||||
title = "Dear Github Maintainers" |
||||
date = 2016-01-15 |
||||
categories = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["github", "comments", "en-au"] |
||||
+++ |
||||
|
||||
A rebuttal to "Dear Github". |
||||
|
||||
<!-- more --> |
||||
|
||||
So recently on Reddit, there is this thread going around about
[Dear Github](https://github.com/dear-github/dear-github),
which points out some problems with Github issues pages.
||||
|
||||
Thing is, most of the problems are not problems with Github itself, but with
the community that grew around it.
||||
|
||||
For example, the most annoying one is the huge amount of "+1" in comments. I've
seen this and yes, it's annoying as hell. Lots of people come around and post
a simple "+1" instead of really contributing. This is *not* an issue with
Github, it is an issue with the community that, instead of helping fix a
problem, thinks that posting "+1" to point out that it is important to them is
actual help. It isn't. I've seen issues with so many "+1" that if everyone
who posted a "+1" actually submitted a single-line change, the bug would be
fixed with lines to spare.
||||
|
||||
(Unpopular opinion: Github should have support for "+1", but actually *ban* it.
It is unhelpful. If it's important to you, you should at least try to
fix the issue instead of posting a "+1" and giving yourself a pat on the back
for "helping out".)
||||
|
||||
Issues missing important information surely are a problem, but that's why you
need to triage your issues. Is there any missing information? You can reply to
the poster. "But why should I ask when I can put up a form for the user to fill
in?" Dude, seriously? You're worried that you will lose 30 seconds of your
life to ask something? Why don't you want to talk to your community, why
don't you want to teach people how to properly report errors? Is it that hard
to be part of a community?
||||
|
||||
But the hurting point is the "if Github was open source, we would fix this
ourselves". [Gitorious](https://en.wikipedia.org/wiki/Gitorious) was open
source and never had that much contribution from the community, to the point
it was closed and moved to Gitlab. So I have to ask: If Bitbucket implemented
this, would all of you move to it? My guess is an indignant "No", because
Github means exposure while all the other public Git sites do not.
||||
|
||||
To me, the whole list is not a list of problems with Github itself, but a |
||||
problem with the open source (in the general, broad term) community that's |
||||
growing around Github. We should worry about building communities, not building |
||||
code with 400 forks, 1000s of "+1" comments and a single maintainer. |
@ -0,0 +1,39 @@
|
||||
+++ |
||||
title = "Announcing JulioBiason.Net 3.0" |
||||
date = 2015-02-18 |
||||
category = "announcements" |
||||
|
||||
[taxonomies] |
||||
tags = ["meta", "blog", "pelican", "en-au"] |
||||
+++ |
||||
|
||||
Short version: New blog URL, engine and layout. |
||||
|
||||
<!-- more --> |
||||
|
||||
Long version: For a long time already, I've been thinking about using a static |
||||
blog generator. Not that there is anything wrong with dynamic blog engines (and |
||||
I'm a long time [WordPress](https://wordpress.org/) user, without any issues, |
||||
especially since my hosting company -- [Dreamhost](http://www.dreamhost.com/) --
offers easy updates), but... I don't know, I think it's easier to automate some
||||
stuff when all you have are basic files, with no API to talk to. |
||||
|
||||
So, here it is. A new blog URL, so all old posts are still visible in their |
||||
original paths (although this will be a problem in the future when I decide to |
||||
launch a 4.0 blog, but that's a problem for the future); a new engine, as |
||||
WordPress is not static, so I decided to go with |
||||
[Pelican](http://blog.getpelican.com/), simply because I know Python (I know |
||||
there is a huge community for [Jekyll](http://jekyllrb.com/), but I'm not a |
||||
Ruby guy and I don't want to be a Ruby guy); and finally a new layout, as I
have been playing with [Zurb Foundation](http://foundation.zurb.com/) and,
since using it would automagically give me a responsive layout, I did just
that. And yes, the
[theme](https://bitbucket.org/juliobiason/pelican-fancy-foundation) is my
creation -- and that's why there is a bunch of broken stuff. I'll be fixing
things in the future, as I see them -- or as someone reports them to me.
||||
|
||||
PS: There is actually a hidden reason: some [things I don't want to deal with
again](http://juliobiason.net/2008/02/23/why-half-life-2-failed/), which were
probably crippling me in what to write (hence why the content was so dull and
boring in the last few months). Because static blogs don't have comments, I may
feel fine in finally discussing them.
@ -0,0 +1,114 @@
|
||||
+++ |
||||
title = "Mocking A Mock" |
||||
date = 2016-07-21 |
||||
category = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["python", "mock", "mongodb", "find", "count", "en-au"] |
||||
+++ |
||||
|
||||
Mocks are an important part of testing, but learn how to properly mock stuff. |
||||
|
||||
<!-- more --> |
||||
|
||||
A few weeks ago we had a test failing. Now, tests failing is not something |
||||
worth a blog post, but the solution -- and the reason it was failing -- is. |
||||
|
||||
A little background information first: The test is part of our Django project;
this project stores part of the information on MongoDB, because the data is
schemaless -- it comes from different sources and each source has its own
format. Because MongoDB is external to our project, it had to be mocked
(sidenote: mocks are there exactly to do this: to avoid having to manage
something external to your project).
||||
|
||||
PyMongo, the MongoDB driver for Python, has a `find()` function, pretty much
like the MongoDB API; this function returns a cursor, which you can iterate
over like a list, with all the resulting records in the collection. That
cursor also has (at least in the PyMongo version we were using) a `count()`
method that returns the number of records. So you have something like this:
||||
|
||||
```mongodb |
||||
connector.collection.find({'field': 'value'}).count() |
||||
``` |
||||
|
||||
(Find everything which has a field named "field" that has a value of "value" |
||||
and count the results. Pretty simple, right?) |
||||
|
||||
The second piece of information you need is about the `mock` module. Python 3
has a module for mocking external resources (`unittest.mock`), which is also
available for Python 2 (as the `mock` package). The interface is the same, so
you can
[refer to the Python 3 documentation](https://docs.python.org/dev/library/unittest.mock.html)
for both versions.
||||
|
||||
A usage example would be something like this: If I had a function like:
||||
|
||||
```python
def request():
    # `connector` is the PyMongo connection imported by this module.
    return connector.collection.find({'field': 'value'})
```
||||
|
||||
and I want to test it, I could do this:
||||
|
||||
```python
import unittest
from unittest.mock import patch  # on Python 2: from mock import patch

from MyModule import request  # assuming request() lives in MyModule


class TestRequest(unittest.TestCase):
    @patch("MyModule.connector.collection.find")
    def test_request(self, mocked_find):
        mocked_find.return_value = [{'field': 'value', 'record': 1},
                                    {'field': 'value', 'record': 2}]
        result = request()
        # The result is a list, so assertEqual is used, not assertDictEqual.
        self.assertEqual(result, mocked_find.return_value)
```
||||
|
||||
Kinda sketchy for a test, but I just want to use it to explain what is going on:
the `@patch` decorator is creating a stub for any call to
`MyModule.connector.collection.find`; inside the test itself, the stub is
being converted to a mock by setting a `return_value`; when the test is run,
the mock library will intercept a call to the `collection.find` inside
`MyModule.connector` (because that module imported the PyMongo driver into its
namespace as `connector`) and return the `return_value` instead.
||||
|
||||
Simple when someone explains like this, right? Well, at least I hope you got |
||||
the basics of this mocked stuff. |
||||
|
||||
Now, what if you had to count the number of results? It's pretty damn easy to
realize how to do so: just call `count()` on the resulting list, or make it
return an object that has a `count()` method.
||||
|
||||
The whole problem we had was that the result of `find()` was irrelevant and |
||||
all we wanted was the count. Something like |
||||
|
||||
```python
def has_values():
    # Only the number of matching records matters here, not the records.
    elements = connector.collection.find({'field': 'value'}).count()
    return elements > 1
```
||||
|
||||
First of all, you can't patch `MyModule.connector.collection.find.count`
because you'll only stub the `count` call, not `find`, which will actually try
to connect to MongoDB; so the original patch is required. And you can't patch
both `find` and `count` because the first patch will return a new `MagicMock`
object, which will not be patched (after all, it is *another* object). The
original developer tried to fix it this way:
||||
|
||||
```python |
||||
mocked_find.count.return_value = 0 |
||||
``` |
||||
|
||||
... which, again, doesn't work because the call to `find()` will return a
`MagicMock` that doesn't have its `count` patched. But the developer never
realized that because `MagicMock` tries its best to *not* blow up your tests,
including having return values for conversions like... int. And it will always
return 1.
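
If you want to see this happening without a whole test suite around it, the
short sketch below reproduces the wrong setup with a bare `MagicMock` (it
stands in for the patched `find`; nothing here touches MongoDB or our real
module).

```python
from unittest.mock import MagicMock

mocked_find = MagicMock()

# The original attempt: this configures `find.count()`,
# not the `count()` of whatever `find()` returns.
mocked_find.count.return_value = 0

result = mocked_find({'field': 'value'}).count()

print(type(result).__name__)   # MagicMock, not a number
print(int(result))             # 1, the default MagicMock gives to int()
print(mocked_find.count())     # 0, but the code under test never calls this
```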
||||
|
||||
Is your head spinning yet? Mine sure did when I realized the whole mess that
was being made. And let me repeat this: The problem was *not* that MongoDB was
being mocked, but that it was being *mocked in the wrong way*.
||||
|
||||
The solution? As pointed above, make `find` return an object with a `count` |
||||
method. |
||||
|
||||
```python
# find() now returns an object whose count() returns 0.
count_mock = MagicMock(return_value=0)
mocked_find.return_value = MagicMock(
    **{'count': count_mock})
```
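
The same setup can also be written by chaining `return_value`, which some
people find easier to read. The sketch below is self-contained so it can be run
as is: `connector` is just a `MagicMock` standing in for the PyMongo
connection, and `set_count` is a helper name I made up for the example.

```python
from unittest.mock import MagicMock

connector = MagicMock()  # stand-in for the real PyMongo connection


def has_values():
    elements = connector.collection.find({'field': 'value'}).count()
    return elements > 1


def set_count(find_mock, number):
    """Make `find_mock(...).count()` return `number`."""
    find_mock.return_value.count.return_value = number


set_count(connector.collection.find, 0)
print(has_values())   # False

set_count(connector.collection.find, 5)
print(has_values())   # True

# The mock also remembers how find() was called.
connector.collection.find.assert_called_with({'field': 'value'})
```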
@ -0,0 +1,60 @@
|
||||
+++ |
||||
title = "Pre-Orders: The Case of No Man's Sky" |
||||
date = 2016-08-25 |
||||
category = "thoughts" |
||||
|
||||
[taxonomies] |
||||
tags = ["pre-order", "grim dawn", "no man's sky", "en-au"] |
||||
+++ |
||||
|
||||
[No Man's Sky](http://www.no-mans-sky.com/) is getting a lot of heat recently
because, well, the game is not everything the developers promised. And a lot of
people are putting the blame on pre-orders and whatnot.
||||
|
||||
<!-- more --> |
||||
|
||||
Thing is, this is not a problem with pre-orders. This is a problem with a
development company not keeping up with the times.
||||
|
||||
Example: [Grim Dawn](http://www.grimdawn.com/). Although not a pre-order thing
per se, the game was on Kickstarter. Today, the game is polished, fun, has
lots of stuff to do, and nowhere is anyone claiming this "pre-order"
thing ruined the game.
||||
|
||||
The difference between Grim Dawn and No Man's Sky is that Crate, the |
||||
developers of the first, continuously delivered versions to get feedback. |
||||
Falling through the world? Ok, we can fix. Game doesn't run on your rig even |
||||
when you have the minimal specs? There is something wrong with our engine. |
||||
That feature? Yeah, it's too big for now, we'll work on it later. |
||||
|
||||
Not on this list, but ArenaNet did something close to that with Guild Wars 2:
People who pre-purchased the game -- an "extreme" version of pre-order --
could participate in the closed beta events. Those events, although not
spanning the whole maps, would allow players to experience some part of
the game and return feedback. They would even claim "we just want to stress
the servers, so weird things could happen" and people were fine with that.
||||
|
||||
Hello Games, on the other hand, did all development behind closed doors. Sure
they are a small company, but there was nothing stopping them from actually
doing some open beta test or whatever to receive feedback. Well, except one
thing: Sony.
||||
|
||||
Sony injected money into Hello Games for their first title (I was about to claim
"a lot of money" but heck if I know how much they funded) and wanted it on
their console. Now, consoles do not have a "here, play for testing" option or a
"sign up for this and we'll add your console ID in our database and you can
download the game" one. To keep things hyped, no one could see the game before
release. No previews, no betas, no nothing. Feature wasn't fun? Who would know,
it's scrapped now. The engine blows up on certain configurations? Only way to
check this is after the final release.
||||
|
||||
So, again, it's not a problem with "Pre-orders are bad and you should feel
bad". This is a problem with a company not keeping up with the times. A lot of
companies are now sharing things beforehand to get more feedback than their
friends and family can give: Microsoft is continuously releasing Windows
versions through their "Windows Insider" program, which anyone can join; Apple
is giving betas of all their OSes to anyone who wants to test them. The idea of
"many eyes make all bugs shallow" finally caught on and people realised it
was right.
||||
|
||||
But, apparently, Hello Games + Sony didn't. |
@ -0,0 +1,312 @@
|
||||
+++ |
||||
title = "Python 2 + 3 = Six" |
||||
date = 2016-11-21 |
||||
|
||||
category = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["python", "six", "python 2", "python 3", "tchelinux", "pt-br"] |
||||
+++ |
||||
|
"Six" is a small Python library that can help you move your code from
Python 2 to Python 3.
||||
|
||||
<!-- more --> |
||||
|
||||
{% note() %}
(This post is related to the talk I gave on November 19 at TchêLinux. The
slides can be found
[in the presentations area](http://presentations.juliobiason.net/python23six.html).)
{% end %}
||||
|
||||
|
||||
Before anything else, there is one question we need to answer: Why would anyone
use Python 3?
||||
|
||||
* All strings are unicode by default; this solves the pile of macabre,
  annoying, damned, wretched `UnicodeDecodeError` problems;
* `Mock` is a standard Python class; it is still possible to install it using
  `pip` and the syntax is exactly the same, but it is one less dependency;
* `Enum` is a standard Python class; Enum is one of the most interesting
  abuses of classes in Python, and really useful;
* AsyncIO and all the lazy evaluation that Python 3 brought; a lot of things
  in Python 3 stopped "generating a list" and started returning an iterator
  or a generator; AsyncIO takes this idea of lazy generation one step
  further and, according to people smarter than me, with PyUV Python
  manages to be as fast as Node, or faster;
* And, most importantly, **Python 2 support ends in 2020!**
||||
|
||||
{% note() %}
There is also string interpolation with the new `f` prefix; it works much like
calling `str.format` with `locals()`: for example, `f'{element} {count}'` is
equivalent to `'{element} {count}'.format(**locals())` (as long as you have
`element` and `count` as local variables in your function).
{% end %}
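
A quick way to check that equivalence is the sketch below (it needs Python 3.6
or newer, which is where the `f` prefix was introduced; the `describe` function
and its arguments are only there for the example):

```python
def describe(element, count):
    # Both lines build the same string from the local variables.
    with_f_prefix = f'{element} {count}'
    with_format = '{element} {count}'.format(**locals())
    return with_f_prefix, with_format


print(describe('parrot', 3))   # ('parrot 3', 'parrot 3')
```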
||||
|
||||
The last point is the most important one. You may think "but there are still
three years until then", but Christmas is coming, soon after it's Carnival and,
before you know it, it's 2020.
||||
|
||||
## The road to Python 3
||||
|
||||
For those who want to start porting their applications to Python 3 right away,
there are two ways:
||||
|
||||
The first is to run your applications with `python -3 [script]`; this makes
the Python 2 interpreter warn about every statement that it would not be able
to convert correctly. I ran a personal script
[dated 2003](https://bitbucket.org/juliobiason/pyttracker) and
Python didn't report anything.
||||
|
||||
{% note() %}
Just to make things clearer: the code I was writing was already more correct
and following more pythonic standards; in 2014 I was still seeing cases where
code running on Python 2.6 still used `has_key()`, which was deprecated in
Python 2.3.
{% end %}
||||
|
||||
There are several reasons for this:
||||
|
||||
1. People got used to writing "Pythonic" code; the language itself did not
   go through big changes.
2. Although some things were removed from the Python language, they were
   slowly reintroduced; one example is the string interpolation operator
   (`%`), which was meant to go away in favour of `str.format` but ended up
   staying.
||||
|
||||
The second way to port your code to Python 3 is to use the `2to3` tool. It
checks for the known Python 3 changes (for example, `print` becoming a
function, the reorganisation of some standard library packages) and presents a
patch to be applied afterwards.
||||
|
||||
Among the conversions that `2to3` makes are replacing `iter`-something calls
with the version without the prefix (for example, `iteritems()` becomes simply
`items()`); `print` is converted to a function; several adjustments are made
to calls to the `urllib` and `urlparse` libraries (these two were merged in
Python 3 and the first went through several internal reorganisations);
`xrange` becomes `range`; `raw_input` is now called `input` and handles its
output differently, among others.
||||
|
||||
There is just one small problem in this conversion from Python 2 to Python 3:
As the list above shows, some commands exist in both versions but behave
differently; for example, `iteritems()` is converted to simply `items()`, but
both methods exist in Python 2: the first returns an iterator and the second
returns a new list with the tuples of all the elements in the dictionary (in
Python 3, `items()` returns a lazy view instead of a list). So, although the
code is grammatically the same in both Python 2 and Python 3, semantically the
two are different.
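
To see the Python 3 side of that difference, the small sketch below (Python 3
only, with a made-up `counts` dictionary) compares what `items()` returns with
an explicit copy, which is roughly what Python 2's `items()` used to build on
every call:

```python
counts = {'a': 1, 'b': 2}

view = counts.items()            # Python 3: a live view, nothing is copied
snapshot = list(counts.items())  # an explicit copy, like Python 2's items()

counts['c'] = 3

print(len(view))      # 3: the view follows the dictionary
print(len(snapshot))  # 2: the copy does not
```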
||||
|
||||
This "same commands, different results" issue can be a big problem if the
system runs in environments that are not easy to change -- for example, it
runs on CentOS 4 or still needs to be compatible with Python 2.6, both
"problems" being, in fact, requirements from the infrastructure team.
||||
|
||||
## Six (and `__future__`) to the Rescue
||||
|
||||
To solve the problem of having code that needs to run on both versions, there
is the [Six](https://pythonhosted.org/six/) library; it acts as the go-between
for Python 2 and Python 3 and provides an interface so that Python 2 code can
be ported to Python 3 while keeping compatibility.
||||
|
||||
In a (relatively silly) example:
||||
|
||||
```python
import collections


class Model(object):
    def __init__(self, word):
        self._count = None
        self.word = word
        return

    @property
    def word(self):
        return self._word

    @word.setter
    def word(self, word):
        self._word = word
        self._count = collections.Counter(word)

    @property
    def letters(self):
        return self._count

    def __getitem__(self, pos):
        return self._count[pos]


if __name__ == "__main__":
    word = Model('This is an ex-parrot')
    for letter, count in word.letters.iteritems():
        print letter, count
```
||||
|
||||
In this example, we have a class that stores a sentence and the number of
times each letter appears in it, using `Counter` to do so (since `Counter`
counts how many times an element appears in an iterable, and strings *are*
iterables).

This example has the following problems:

1. `class Model(object)`: in Python 3, every class is a "new-style class" and
   inheriting from `object` is no longer necessary (but it doesn't affect how
   the class works);

2. `for letter, count in word.letters.iteritems()`: as discussed before,
   `iteritems()` ceased to exist and became `items()`; `items()` exists in
   Python 2, but it behaves differently. In our case, the result of the
   operation stays the same, but under Python 2 memory consumption goes up on
   every call, since a new list is built each time.

3. `print letter, count`: `print` is now a function and works slightly
   differently from the Python 2 version.

So, to make this code compatible with Python 2 and Python 3 at the same time,
we need to do the following:
||||
|
||||
> `class Model(object)` |
||||
|
||||
Nothing needs to be done.
||||
|
||||
> `print letter, count` |
||||
|
||||
```python |
||||
from __future__ import print_function |
||||
print('{} {}'.format(letter, count)) |
||||
``` |
||||
|
||||
`print` as a function can be "brought from the future" using the
`__future__` module (available from Python 2.6 onwards); since printing
several values separated by commas is not recommended, `str.format` is the
recommended way.

A better option (in my opinion) is:
||||
|
||||
```python
from __future__ import print_function
print('{letter} {count}'.format(letter=letter,
                                count=count))
```
||||
|
||||
This way, the parameters used in the output are named and can be changed.
It does produce a `KeyError` when a name used in the format string isn't
passed in the `format` parameter list, but in more complex strings the result
is easier to understand (for example, I find `{letter} appears {count}
times` easier to follow than `{} appears {} times`; also, it's possible to
change the order of the variables in the format string without changing the
order in the parameter list).
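
As a small sketch of those three points (the values of `letter` and `count`
are made up for the example):

```python
letter, count = 'r', 2

# Positional and named placeholders produce the same text...
positional = '{} appears {} times'.format(letter, count)
named = '{letter} appears {count} times'.format(letter=letter, count=count)

# ...but named ones can be reordered without touching the argument list.
reordered = '{count} times: {letter}'.format(letter=letter, count=count)

# A name used in the string but missing from the arguments raises KeyError.
missing = None
try:
    '{letter} appears {count} times'.format(letter=letter)
except KeyError as exc:
    missing = exc.args[0]

print(named)
print(reordered)
print('missing name:', missing)
```

Named fields also work on Python 2.6, where the empty `{}` form is not
accepted (automatic numbering only arrived in 2.7).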
||||
|
||||
An even better option is:
||||
|
||||
```python |
||||
import six |
||||
six.print_('{letter} {count}'.format(letter=letter, |
||||
count=count)) |
||||
``` |
||||
|
||||
With Six, the dependency on `__future__` goes away, and the same code keeps
working on Python 2.6.
||||
|
||||
> `for letter, count in word.letters.iteritems():` |
||||
|
||||
```python
import six

# `word` is the Model instance from the example above.
for letter, count in six.iteritems(word.letters):
    six.print_('{letter} {count}'.format(letter=letter,
                                         count=count))
```
||||
|
||||
Six provides a unified interface for iterating over items in both Python 2
and Python 3: `six.iteritems()` will call `iteritems()` if it is running on
Python 2 and `items()` if it is running on Python 3.
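
To make the idea concrete, here is a simplified sketch of that kind of
dispatch using only the standard library. It is not Six's actual source; the
local names `PY2` and `iteritems` are just illustrative stand-ins.

```python
import collections
import sys

# Illustrative flag: True only when running under a Python 2 interpreter.
PY2 = sys.version_info[0] == 2


def iteritems(mapping):
    """Return an iterator over (key, value) pairs on Python 2 and 3."""
    if PY2:
        # Python 2: the lazy version is called iteritems().
        return mapping.iteritems()
    # Python 3: items() is already lazy; wrap it so we always get an iterator.
    return iter(mapping.items())


letters = collections.Counter('ex-parrot')
for letter, count in iteritems(letters):
    print('{0} {1}'.format(letter, count))
```

The calling code never needs to know which interpreter it is on; only the
helper does.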
||||
|
||||
And so our relatively silly code is now compatible with Python 2 and
Python 3, and runs identically on both.

But let's move on to a real example:
||||
|
||||
```python
import urllib
import urlparse

def add_querystring(url, querystring, value):
    frags = list(urlparse.urlsplit(url))
    query = frags[3]
    query_frags = urlparse.parse_qsl(query)
    query_frags.append((querystring, value))
    frags[3] = urllib.urlencode(query_frags)
    return urlparse.urlunsplit(frags)

if __name__ == "__main__":
    print add_querystring('http://python.org', 'doc', 'urllib')
    print add_querystring('http://python.org?doc=urllib',
                          'page', '2')
```
||||
|
||||
{% note() %} |
||||
Yes, yes, the code could be a simple "check whether there is a question mark
in the URL; if there is, append `&` and the query string; if there isn't,
append `?` and the query string". The point is: this way, I get a solution
that accepts any URL, in any format, with anything in the middle, because
Python's standard library guarantees it will be parsed correctly.
||||
{% end %} |
||||
|
||||
This is the code of a function used to add a query string to a URL. The
problem with this function is that both `urllib` and `urlparse` went through
big changes, even ending up under the same module (for the functions used
here, it's all `urllib.parse` now).

To make this code compatible with Python 2 and 3 at the same time, you need to
use the `six.moves` module, which contains all these relocations in the
standard library (including, in this case, `urllib` and `urlparse`).
||||
|
||||
```python
import six

def add_querystring(url, querystring, value):
    frags = list(six.moves.urllib.parse.urlsplit(url))
    query = frags[3]
    query_frags = six.moves.urllib.parse.parse_qsl(query)
    query_frags.append((querystring, value))
    frags[3] = six.moves.urllib.parse.urlencode(query_frags)
    return six.moves.urllib.parse.urlunsplit(frags)

if __name__ == "__main__":
    six.print_(add_querystring('http://python.org', 'doc', 'urllib'))
    six.print_(add_querystring('http://python.org?doc=urllib',
                               'page', '2'))
```
||||
|
||||
What was done here was to use `six.moves.urllib.parse`. This layout is no
accident: in Python 3, the functions from `urlparse` now live in
`urllib.parse`; Six decided that the correct location for the functions
"inside itself" would be the packages used in Python 3.
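
If adding Six as a dependency isn't an option, roughly the same effect can be
sketched with a guarded import. This is not how Six itself is implemented,
just a minimal stand-in that uses only the standard library:

```python
try:
    # Python 3: all four functions live under urllib.parse.
    from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
except ImportError:
    # Python 2: the same functions are split between two modules.
    from urllib import urlencode
    from urlparse import parse_qsl, urlsplit, urlunsplit


def add_querystring(url, querystring, value):
    """Append `querystring=value` to the query part of `url`."""
    frags = list(urlsplit(url))
    query_frags = parse_qsl(frags[3])  # index 3 is the query component
    query_frags.append((querystring, value))
    frags[3] = urlencode(query_frags)
    return urlunsplit(frags)


if __name__ == "__main__":
    print(add_querystring('http://python.org', 'doc', 'urllib'))
    print(add_querystring('http://python.org?doc=urllib', 'page', '2'))
```

The trade-off is that every relocated module needs its own `try`/`except`
block, which is exactly the bookkeeping `six.moves` does for you.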
|
||||
And so we have two examples of programs that run the same way on both
Python 3 and Python 2.

Also, a tip: if some software you use doesn't run correctly with Python 3,
using Six can help keep the current code going until a rewrite solves the
problem.
||||
|
||||
## Other Questions

### What about being stuck with Six forever?

Many applications have already set a cut-off for supporting their versions
that run on Python 2. For example, Django announced that version 2.0 of the
framework will support Python 3 only, with the last Python 2-compatible
release (1.11) supported until 2020.
||||
|
||||
### How hard is it to port to Python 3?

Not very hard, now. Many of the things that were removed (or were going to
be) and caused headaches during conversion came back; the classic case is the
string interpolation operator `%`, which was meant to go away in favour of
`str.format` but ended up staying. Another reason is that scripts are more
"pythonic" these days, largely thanks to people like [Raymond
Hettinger](https://rhettinger.wordpress.com/), who has made excellent videos
about writing Python code the Python way (that is, "pythonic" code). And, as
a personal anecdote, I can say that my code from 2003 ran with `python -3`
without raising a single warning.
@ -0,0 +1,102 @@
|
||||
+++ |
||||
title = "The Day I Found My Old Code" |
||||
date = 2015-12-18 |
||||
category = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["code", "python", "pep8", "pylint", "en-au"] |
||||
+++ |
||||
|
||||
Found a piece of code I wrote 2 years ago, following a lot of linters. I'm |
||||
amazed how readable the code still is. |
||||
|
||||
<!-- more --> |
||||
|
||||
Today, walking across a client repository, I found a module I wrote two years |
||||
ago in Python. At the time, we lacked the knowledge to write proper tests, but |
||||
we used a lot of other tools: PEP8 and Pylint, mostly. |
||||
|
||||
Today-me is pissed with two-years-ago-me for the lack of tests, but where my
memory forgot the nuances of the project, the huge amount of comments and
proper documentation make up for it.

For example, every pylint disable has an explanation about why it was
disabled:
||||
|
||||
```python |
||||
# flask has a weird way to deal with extensions, which work fine but confuses |
||||
# the hell out of PyLint. |
||||
``` |
||||
|
||||
Related modules are loaded in sequence, with line breaks between different |
||||
sources: |
||||
|
||||
```python |
||||
from flask.ext.babel import Babel |
||||
from flask.ext.babel import refresh |
||||
|
||||
from flask.ext.gravatar import Gravatar |
||||
|
||||
from werkzeug.routing import NotFound |
||||
from werkzeug.routing import RequestRedirect |
||||
``` |
||||
|
||||
Every variable, every function, is documented in proper Sphinx format, which
helps in understanding what each variable/function does:
||||
|
||||
```python |
||||
#: Session duration time |
||||
#: The time is given as number and a time interval ("m" for minutes, "h" for |
||||
#: hours, "d" for days and "w" for weeks), e.g., "3d". A value of "None" will |
||||
#: make the session last till the user closes the browser. |
||||
SESSION_EXPIRATION = "1d" |
||||
``` |
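
For illustration only, here is how a value in that format could be turned
into something usable. `parse_expiration` and `_UNITS` are hypothetical
names, not the code from that project, and the sketch assumes "None" means
Python's `None`:

```python
import datetime

# Hypothetical mapping from the suffixes in the comment to timedelta keywords.
_UNITS = {'m': 'minutes', 'h': 'hours', 'd': 'days', 'w': 'weeks'}


def parse_expiration(value):
    """Turn a string like "3d" into a timedelta.

    None is passed through, meaning "until the browser is closed".
    """
    if value is None:
        return None
    amount, unit = value[:-1], value[-1:]
    if unit not in _UNITS or not amount.isdigit():
        raise ValueError('invalid expiration: {0!r}'.format(value))
    return datetime.timedelta(**{_UNITS[unit]: int(amount)})


print(parse_expiration('1d'))
print(parse_expiration('30m'))
```

With the format written down next to the setting, a helper like this can be
checked against the comment instead of against someone's memory.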
||||
|
||||
```python |
||||
def reroute(route): |
||||
"""Route control. The route must exist in the known routes list to return |
||||
a valid result; unknown routes will be redirected to the 404 page; if the |
||||
route exists but it's marked as "maintenance", the request will be |
||||
redirected to the 503 page.""" |
||||
``` |
||||
|
||||
Also, I found a class with a docstring of about 20 lines. It explains every |
||||
single parameter in its `__init__` function, which makes perfect sense when |
||||
you generate the documentation. |
||||
|
||||
Where the functions lacked a good name (due to having a good name inside their
own objects/modules), a comment was added to explain what the function was
actually doing:
||||
|
||||
```python |
||||
inject(current_app) # inject values if run stand-alone modules |
||||
load_routes(current_app) # load the routing information |
||||
register_filters() # register jinja filters |
||||
register_functions() # register jinja functions |
||||
register_tests() # register jinja tests |
||||
set_session_time() # define the cookie time |
||||
``` |
||||
|
||||
Also, I had the slight habit of putting large comments in the code when |
||||
something was kinda hacky: |
||||
|
||||
```python |
||||
# Now you're asking yourself: "Why heuristic find?" The reason is |
||||
# simple: in _function() , we add a new endpoint on top of one |
||||
# existing endpoint; because we do that on top of anything, we don't |
||||
# know, for sure, which one of the parameters the user (the other |
||||
# programmer, in this case) used in their URLs. So we need to through |
||||
# all parameters they expect to receive in their detail function in |
||||
# the hopes of finding something that actually matches a "pk". |
||||
``` |
||||
|
||||
It doesn't make much sense here, but believe me, it works. I was just reading
the code with a function called `heuristic_find` and I thought "Man, what
drugs was I on to call it 'heuristic_find'?" And BOOM, there was the
explanation of why it was called that.

Ok, honesty time: I wasn't the only one writing this code. But, thanks to the
client's input, I started and enforced all those rules (and wrote a huge part
of the base code), and the code is still readable two years later.
||||
|
||||
Yeah, I'm proud of it. |
@ -0,0 +1,43 @@
|
||||
+++ |
||||
title = "The Sad Life of Walter Mitty" |
||||
date = 2015-03-28 |
||||
category = "thoughts" |
||||
|
||||
[taxonomies] |
||||
tags = ["movies", "the secret life of walter mitty", "rethink", "review", "en-au"] |
||||
+++ |
||||
|
||||
I once wrote about [The Secret Life of Walter
Mitty](http://juliobiason.net/2014/11/13/the-secret-life-of-walter-mitty-2013/)
and how it is a nice story about a guy outgrowing his daydreams.
||||
|
||||
<!-- more --> |
||||
|
||||
But today I realized I saw everything wrong.

The second time I watched the movie, in the scene where Walter talks to Todd
(from E-Harmony) on the top of the Himalayas, I thought "Well, that's one hell
of a mobile company, they have signal on the top of the Himalayas".
||||
|
||||
The third time I realized *how* the signal was that good: Walter never went to |
||||
the Himalayas. |
||||
|
||||
Let's assume that, in the story, Walter really went to Greenland and Iceland
and came back. And then he got fired for losing negative #25. This is where
I believe everybody is tricked. At that point, Walter actually lost his only
connection to real life (his job) and descended into a full-time illusion.

That's why the recluse Mitty went to Afghanistan and had to give cake to guys
with guns. That's how his call is crystal clear on the top of the Himalayas.
That's why Todd, who never really knew Walter, went to the airport to rescue
him. That's how his E-Harmony profile suddenly was the hottest profile ever.
That's why the piano check was so large, so he wouldn't need to worry about
his unemployed life. That's why his mom saved the wallet. That's how he
finally manages to face Ted. That's why Cheryl is right there when he gets his
severance check. And that's how his damn face appears on the cover of Life.
That even explains why Sean never took the picture -- if there is no picture,
there is nothing to show that the whole thing was a dream.

When you pick the "he's completely disconnected from reality and lives in his
imagination now" reading, the whole ending stops being a succession of lucky
happenings and starts to make sense. A sad sense, but a sense, nonetheless.
@ -0,0 +1,66 @@
|
||||
+++ |
||||
title = "When I Used PEP8 To Fuck Up Code" |
||||
date = 2016-07-19 |
||||
|
||||
category = "code" |
||||
|
||||
[taxonomies] |
||||
tags = ["python", "pep8", "readability", "en-au"] |
||||
+++ |
||||
|
||||
We "inherited" some Python code recently. Although another team was working on |
||||
it, we now should support it and keep it going. The previous team at least |
||||
tried to use Pylint and follow PEP8. And I say "tried" because their |
||||
`pylintrc` has a couple of exceptions and their PEP8 extended the |
||||
maximum column to 100. |
||||
|
||||
<!-- more --> |
||||
|
||||
{% note () %} |
||||
Pylint exceptions are almost the common case these days, especially in
a Django project. But plain, pure `pylintrc` exclusions without giving any
pointers on *why* you're adding that exception are dumb, IMHO. I had a
project where we decided to add pylint exceptions inside the code, but for
every exception there should be a comment preceding it explaining the
reason for the exception ("the framework doesn't expose this directly",
"pylint can't see this variable, but it is there", "It's the common place
to name the variable this way" and so on).
||||
{% end %} |
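
As a made-up example of that rule (the `Point` class is purely illustrative,
and whether Pylint complains about short names at all depends on its version
and configuration), an in-code exception with its reason could look like
this:

```python
class Point(object):
    """A point on a 2D plane."""

    # "x" and "y" are the common names for coordinates; longer names would
    # only make the math harder to read, so the naming check is silenced
    # for this method only.
    def __init__(self, x, y):  # pylint: disable=invalid-name
        self.x = x
        self.y = y


point = Point(3, 4)
print(point.x, point.y)
```

The exception lives next to the code it affects, and whoever reads it later
doesn't have to guess whether it was a decision or laziness.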
||||
|
||||
Quick pause here 'cause I know a bunch of people will complain with a "But
monitors these days are very large and you don't need to focus on column 80;
we don't use CGA anymore, old person!". The thing about the maximum column at
80 is *not* about "being visible on every CGA" but actually a measure of
readability: If you speak in short, concise sentences, people will get the
idea quickly; if you keep a stream of words non-stop without reaching a
conclusion and without any punctuation to keep the ideas flowing, you will end
up with something that is easier to forget and in which the central idea will
be lost (and I freaking hope you got what I just did). It's tiring to read a
very long sentence; it's easy to keep the context on a short sentence.
||||
|
||||
In the spirit of "proper" PEP8, I reformatted one of the failing tests
to follow the 80 column limit. And now the code looks like crap. And
I'll commit it like that. It's not because I hate my coworkers, but to point
out that, because it's a pain to read, the structure of the code is too
complex. If someone comes and says "damn, this test is hard to read", I'll be
able to point out that it is not the test that is hard to read, but the code
that reached the point where its complexity is leaking into the test code; it
is now a good time to refactor this to simplify things and make them easier to
read.
||||
|
||||
{% note() %} |
||||
Actually, the reason for it to fail is too damn fun and worth a proper blog
post about it. Stay tuned!
||||
{% end %} |
||||
|
||||
Not that we can simply stop working and fix the damn architecture of it, but we
can at least keep this beast around till everybody gets pissed and realizes it
*desperately needs* a refactor.
||||
|
||||
{% note() %} |
||||
Weird thing: people usually assume some countries are the center of bad code;
this baseline is coming from a "first world country" and, heck, it has one of
the worst designs I ever saw. I'll not name names here to protect the (maybe)
innocent. But in the second week of training, I realized this whole project
has, at least, 6 months of technical debt already.
||||
{% end %} |