diff --git a/content/agile-vs-culture.md b/content/agile-vs-culture.md new file mode 100644 index 0000000..6e07fc3 --- /dev/null +++ b/content/agile-vs-culture.md @@ -0,0 +1,59 @@ ++++ +title = "Agile vs Culture: The Story of Outliers" +date = 2015-12-18 +category = "thoughts" + +[taxonomies] +tags = ["agile", "book", "empowerment", "disenfranchise", "en-au"] ++++ + +When the culture goes against agile. + + + +![The Agile cycle](/agile.jpg) + +At some agile conferences I went to this year, I've been recalling and +telling one story from +[Outliers](https://www.goodreads.com/book/show/3228917-outliers) +(which I wrongly assumed was part of +[Freakonomics](https://www.goodreads.com/book/show/1202.Freakonomics)) +about the number of accidents in Asian and South American +airlines. The book points out that there is a cultural difference between those +two and American people, in which the former see a larger distance between +themselves and their superiors than the latter. + +Why do I keep recalling this? Because in agile teams, there is no hierarchy: the +PO is as important as the junior developer; the tester has the same input +value as the senior developer. This means that the team doesn't need to wait +for someone higher in the chain to make a decision: the team is free to make +its own decisions on how to better reach the value requested by the PO. + +In all the events I went to, there is a constant question of "how do I make my team +see the value in Agile" and "why Agile doesn't work". Again, it seems that +Agile goes straight against the cultural reference of South Americans -- in this +case, me and my colleagues -- because we are culturally trained to look to that guy +who is in a higher place in the chain and, thus, to depend on him for the +important questions (for whatever value of "important" I believe a solution +is).
+ +In the end, it's not so much about changing a company's development model and +explaining to managers and directors how the software -- and its value + -- will be delivered, but about fighting against the cultural norm of having +someone in a very high place who can make decisions while people think they +are too low in the chain to make a decision. Not to mention the constant fear +of being wrong (which is actually good in agile). + +The problem revolves not only around this point, but also around the assumed position +based on role name. Someone will assume that because their position is +"developer", it means that they are below -- and receive orders from -- +the PO; someone will assume that because someone else's role is tester and +they are designated as developer, they are higher in the hierarchy and, thus, can order +the tester to do whatever they think must be done. + +Here we have a second problem: we need to detect and empower those who think +they are below in the chain and "disenfranchise" those who think they are +above everyone else due to the role name. + +My plan for 2016 is to read some books about those topics and bring this +discussion to future events. Wish me luck. ;) diff --git a/content/agile.jpg b/content/agile.jpg new file mode 100644 index 0000000..e935f1c Binary files /dev/null and b/content/agile.jpg differ diff --git a/content/couchbase-example-and-rest.md b/content/couchbase-example-and-rest.md new file mode 100644 index 0000000..d575d9d --- /dev/null +++ b/content/couchbase-example-and-rest.md @@ -0,0 +1,238 @@ ++++ +title = "Couchbase Example and REST" +date = 2016-01-12 +category = "code" + +[taxonomies] +tags = ["rest", "couchbase", "example", "restful", "en-au"] ++++ + +Using the Couchbase example to show how REST works. + + + +Let me start this by pointing out that I'm a RESTnazi: I'm the kind of guy that +will get into a fight with anyone that says things like "Ok, that's because +this is just REST, not RESTful" because...
well, because, there is no +difference between REST and RESTful. + +And today I found something weird while reading +[the Couchbase documentation](http://developer.couchbase.com/documentation/server/4.1/travel-app/travel-app-walkthough.html) +with them claiming that their example is REST while... well, it isn't. + +But hey, that's a good opportunity to explain a bit what REST is (and what it is +not). + +## What is REST? + +REST is an architecture/design pattern/pick your buzzword built on top of HTTP +to provide information. It has two components: + +* **Resources**: Those are the elements in your system: Your users, your books, + your airports, your flights and such. + +* **Verbs**: Those are the things you do with your resources: You GET them, + you UPDATE them, and so on. + +There is no true "guideline" on how to write resources. It's usually done with +nouns in their plural form (or, at least, that's what [Apigee](http://apigee.com/about/) +concluded after checking a bunch of APIs around). Those resources are mapped +through URLs with some base. + +Let's pick the example from Couchbase: It's a travel app, with airports, +flights and flight paths. We could use a base URI scheme of +`/travel/api/v1.0/` because: + +1. The travel app could also provide a user interface through `/travel/`, so + we keep the API endpoint on `api` to not mix things. + +2. We are versioning the API (here, v1.0). This is a recommendation from + Apigee and, again, not part of the architecture/design pattern/buzzword. + +On top of this base URI, we'll build our resource URLs: + +* `/travel/api/v1.0/airports/` and +* `/travel/api/v1.0/flights/` + +"Where is the flight path endpoint?", you must be asking. Well, I'll tell you +about it later, hold on a second, but we'll use those two to explain the very +basics of REST first, ok? + +Besides those two URIs, we need two more: One for each resource to access +individual elements.
So now we have: + +* `/travel/api/v1.0/airports/`; +* `/travel/api/v1.0/airports/{airport_id}`; +* `/travel/api/v1.0/flights/` and finally +* `/travel/api/v1.0/flights/{flight_id}`. + +So, now that we have our resources, we need ways to manage their contents. For +this, we use the "verbs" I mentioned before. The thing about REST is that +those actions are directly tied to the HTTP verbs: + +* **GET** will retrieve elements in the resource; +* **POST** will insert a new element in the resource; +* **PUT** is used to update the information of an element; +* **DELETE** is used to remove an element from the resource. + +{% note () %} +If you want an easy mnemonic, "PUT" has a "U", for "update". Yes, it's silly, +but it works (at least, for me). Also, a "PUT" directly on a resource means +"replace the whole database with this information" and, thus, is not really +widespread. +{% end %} + +{% note() %} +You can add a DELETE for your whole resource, if you're crazy and bold enough. +{% end %} + +And combining those two we have: + +* Get a list of all airports: `GET /travel/api/v1.0/airports/` +* Add a new airport: `POST /travel/api/v1.0/airports/` +* Get information of a single airport: `GET /travel/api/v1.0/airports/airport_3577` +* Update the information of an airport: `PUT /travel/api/v1.0/airports/airport_3577` + +... and so on. + +Easy as pie, right? + +## The "Flight Path" resource + +Now let's go back to the "flight path" resource, which I left behind. Thing +is, a flight path does not exist on its own. If a flight doesn't exist, the +flight path doesn't exist either, right? And if a flight exists, it should have +a path, right? + +So a flight path is a resource linked directly to our resource of flights. For +this, REST allows resource chaining by just adding another layer on top of +existing URIs.
As we pointed out before, a flight path **needs** a flight (a +flight *element*, just to make it clearer where I'm going with this), so +we should build the resource on top of an element URI: + +* `/travel/api/v1.0/flights/airline_24/paths` and +* `/travel/api/v1.0/flights/airline_24/paths/{path_id}` + +... although the last one only makes sense if a flight could have two (or +more) different paths, which would happen if it goes one way on one path and +comes back on a different one. I do not know enough about flights to +know if this is possible, but for the sake of explaining everything about +REST, let's go with it, mkay? + +And now you may be wondering: Why not simply do +`/travel/api/v1.0/flightpaths/{path_id}`? Again, because flight paths are tied +to flights: if the flight doesn't exist, its base resource won't exist either and, +thus, its sub-resources won't be available, which makes a lot of sense. + +## Filtering results + +Ok, now we know how to retrieve all airports, which is nice, but we don't want +them all: the user will type something and we'll show them only the airports +that match their search. We could screw the user and send the whole list to +them and let the application filter it locally, abusing the user's bandwidth and +CPU power -- which isn't nice, since we have a database on our side that can +do this filtering faster. + +Because we can use URIs only to point to resources and resource elements, we +need a different way of passing this to the server. And guess what? HTTP has +a proper way to do this: querystrings and forms. + +Querystrings, for those unfamiliar with HTTP, are the things that come after +the "?" in the URL. For example, in the URL +"`http://example.com/sayname?name=julio`", "`name=julio`" is the querystring. +It provides a key ("name") and a value ("julio").
Forms are basically the +same, but instead of being part of the URL, they are sent in the body of the +HTTP request (and can be much, much larger than querystrings). + +There is one more thing about querystrings and forms: The only way to send +information to the server in a `GET` request is through querystrings, since +GETs do not have a body. DELETEs can have a body, but the RFC says it should +be ignored. POST and PUT do have bodies and, thus, information about the +element to be added/updated should come in there. + +So, for filtering, we could have a "filter" querystring to filter elements. +Couchbase filters airports with a single querystring, so we could simply do + +`GET /travel/api/v1.0/airports/?filter=` + +So the user will see a bunch of airports matching their input. And, since we have +all the airports, we could also link the flights as a subresource of them, with: + +`GET /travel/api/v1.0/airports//flights/` + +... which we didn't mention before, but it is now making sense, right? + +The Couchbase example also allows showing which flights connect two airports and +the REST way is, again, using querystrings: + +`GET /travel/api/v1.0/airports//flights/?connectedTo=` + +And, if you want to be nice enough, you could even add a "fields" parameter, +so your API consumers could filter out fields they don't want in the results, +to reduce the bandwidth required. But it's all up to you. + +Weird how things make absolute sense here, and we never called the "flights" +resource, right? That's one of the things about REST: you build resources in a +way that makes sense for the **consumer** of the API, not to reflect your +database. + +## Pagination + +Just for the sake of completeness, let's talk a bit about pagination. + +Pagination, in REST, works for getting all the elements in the resource, so +it's used in the GET request for the resource. And, because it's part of the +GET request, it should come in the querystring.
+ +There are a couple of ways of doing pagination, in this case: + +* Let the consumer specify page size and page number: In this case, you could + have a query string like `count=15&page=2` to retrieve the elements from the + second page of 15 elements each. This is the most common way of doing + pagination and Twitter is one good example of this. + +* Have a hardcoded page size: Same as before, but the only option available is + `page=2`. + +* Have the consumer specify the last seen element and page size. So the first + request would have something like `count=15` to retrieve the first 15 + elements, but the next request would have the last element in the list as a + parameter, like `count=15&lastSeen=16` and the server would return the next 15 + elements that come after the element with id "16". This prevents duplication + in the results in case a new element is added. Reddit uses this in their + API. + +## The type of response + +Again, for the sake of completeness, you may have noticed that not even once did I +mention the type of data to be returned in each step. That's because REST +does not have a format: You could build a whole service that returns HTML +pages in REST format, and that's ok; you could return JSON, which the +Couchbase documentation correctly points out is the most widely used +format; you could return XML; if you're crazy enough and want to return in +COBOL format, go for it! + +## So, where does the example fail to be REST? + +1. All paths are marked with "findAll". "findAll" is **not** a resource and, + thus, shouldn't be in the URL. + +2. As I pointed out, flight paths are actually a sub-resource of flights and + should be linked. Flight paths should **not** exist if the flight doesn't + exist. + +The flight path query uses querystrings to retrieve the information for paths +that go through two airports, which is the right way of doing it, but again, it +shouldn't be a resource in itself. + +## How to fix the documentation + +Easy way?
Remove the "REST" mention in the pages. I *am* nitpicking the word +"REST" there, I fully recognize it, and I understand that for the sake of +example it doesn't have to be REST, but it seems wrong to tell people +something is REST when it isn't. + +If Oracle decided to say "we added a field type that can store huge amounts +of JSON data, and although you can't query its content, we can now say +OracleDB is a NoSQL database", people would lose their minds. And that's kinda +how I'm feeling about this whole thing. diff --git a/content/dead-github-maintainers.md b/content/dead-github-maintainers.md new file mode 100644 index 0000000..3d4796f --- /dev/null +++ b/content/dead-github-maintainers.md @@ -0,0 +1,53 @@ ++++ +title = "Dear Github Maintainers" +date = 2016-01-15 +categories = "code" + +[taxonomies] +tags = ["github", "comments", "en-au"] ++++ + +A rebuttal to "Dear Github". + + + +So recently on Reddit, there is this thread going around about +[Dear Github](https://github.com/dear-github/dear-github), +which points out some problems with Github issues pages. + +Thing is, most of the problems are not problems with Github itself, but with the +community that grew around it. + +For example, the most annoying one is the huge amount of "+1" in comments. I've +seen this and yes, it's annoying as hell. Lots of people come around and post +a simple "+1" instead of really contributing. This is *not* an issue with +Github, it is an issue with the community that, instead of helping fix a +problem, thinks that posting "+1" to point out that it is important to them is +actual help. It isn't. I've seen issues with so many "+1" that if everyone +who posted a "+1" actually submitted a single change, the bug would be fixed +with lines to spare. + +(Unpopular opinion: Github shouldn't add support for "+1"; it should actually *ban* it. +It is unhelpful.
If it's important to you, you should at least try to +fix the issue instead of posting "+1" and giving yourself a pat on the back for +"helping out".) + +Issues missing important information surely are a problem, but that's why you +need to triage your issues. Is there any missing information? You can reply to +the poster. "But why should I ask when I can put up a form for the user to fill in +issues?" Dude, seriously? You're worried that you will lose 30 seconds of your +life to ask something? Why don't you want to talk to your community, why don't +you want to teach people how to properly report errors? Is it that hard to +be part of a community? + +But the sore point is the "if Github was open source, we would fix this +ourselves". [Gitorious](https://en.wikipedia.org/wiki/Gitorious) was open +source and never had that much contribution from the community, to the point +it was closed and moved to Gitlab. So I have to ask: If Bitbucket implemented +this, would all of you move to it? My guess is an indignant "No", because +Github means exposure while all the other public Git sites do not. + +To me, the whole list is not a list of problems with Github itself, but a +problem with the open source (in the general, broad term) community that's +growing around Github. We should worry about building communities, not building +code with 400 forks, 1000s of "+1" comments and a single maintainer. diff --git a/content/juliobiason.net-3.0.md b/content/juliobiason.net-3.0.md new file mode 100644 index 0000000..01903d1 --- /dev/null +++ b/content/juliobiason.net-3.0.md @@ -0,0 +1,39 @@ ++++ +title = "Announcing JulioBiason.Net 3.0" +date = 2015-02-18 +category = "announcements" + +[taxonomies] +tags = ["meta", "blog", "pelican", "en-au"] ++++ + +Short version: New blog URL, engine and layout. + + + +Long version: For a long time already, I've been thinking about using a static +blog generator.
Not that there is anything wrong with dynamic blog engines (and +I'm a long time [WordPress](https://wordpress.org/) user, without any issues, +especially since my hosting company -- [Dreamhost](http://www.dreamhost.com/) -- +offers easy updates), but... I don't know, I think it's easier to automate some +stuff when all you have are basic files, with no API to talk to. + +So, here it is. A new blog URL, so all old posts are still visible in their +original paths (although this will be a problem in the future when I decide to +launch a 4.0 blog, but that's a problem for the future); a new engine, as +WordPress is not static, so I decided to go with +[Pelican](http://blog.getpelican.com/), simply because I know Python (I know +there is a huge community for [Jekyll](http://jekyllrb.com/), but I'm not a +Ruby guy and I don't want to be a Ruby guy); and finally a new layout, as +I've been playing with [Zurb +Foundation](http://foundation.zurb.com/) and, since I'd automagically gain a +responsive layout, I went with it. And yes, the +[theme](https://bitbucket.org/juliobiason/pelican-fancy-foundation) is my +creation -- and that's why there is a bunch of broken stuff. I'll be fixing +it in the future, as I see it -- or as someone reports it to me. + +PS: There is actually a hidden reason: some [things I don't want to deal +with again](http://juliobiason.net/2008/02/23/why-half-life-2-failed/), which were +probably crippling what I wrote (hence why the content was so dull and +boring in the last few months). Because static blogs don't have comments, I may +feel fine in finally discussing them.
diff --git a/content/mocking-a-mock.md b/content/mocking-a-mock.md new file mode 100644 index 0000000..93489b2 --- /dev/null +++ b/content/mocking-a-mock.md @@ -0,0 +1,114 @@ ++++ +title = "Mocking A Mock" +date = 2016-07-21 +category = "code" + +[taxonomies] +tags = ["python", "mock", "mongodb", "find", "count", "en-au"] ++++ + +Mocks are an important part of testing, but learn how to properly mock stuff. + + + +A few weeks ago we had a test failing. Now, a failing test is not something +worth a blog post, but the solution -- and the reason it was failing -- is. + +Some background information first: The test is part of our Django project; +this project stores part of the information on MongoDB, because the data is +schemaless -- it comes from different sources and each source has its own +format. Because MongoDB is external to our project, it had to be mocked +(sidenote: mocks are there exactly to do this: to avoid having to manage +something external to your project). + +PyMongo, the MongoDB driver for Python, has a `find()` function, pretty much +like the MongoDB API; this function returns a list (or iterator, I guess) with +all the result records in the collection. Because it is a list (iterator, +whatever), it has a `count()` function that returns the number of records. So +you have something like this: + +```python +connector.collection.find({'field': 'value'}).count() +``` + +(Find everything which has a field named "field" that has a value of "value" +and count the results. Pretty simple, right?) + +The second piece of information you need is about the `mock` module. Python 3 +has a module for mocking external resources, which is also available for Python 2. +The interface is the same, so you can +[refer to the Python 3 documentation](https://docs.python.org/dev/library/unittest.mock.html) +for both versions.
+ +A usage example would be something like this: If I had a function like: + +```python +def request(): + return connector.collection.find({'field': 'value'}) +``` + +and I want to test it, I could do this: + +```python +class TestRequest(unittest.TestCase): + @patch("MyModule.connector.collection.find") + def test_request(self, mocked_find): + mocked_find.return_value = [{'field': 'value', 'record': 1}, + {'field': 'value', 'record': 2}] + result = request() + self.assertEqual(result, mocked_find.return_value) +``` + +Kinda sketchy for a test, but I just want to use it to explain what is going on: +the `@patch` decorator is creating a stub for any call to +`MyModule.connector.collection.find`; inside the test itself, the stub is +being converted to a mock by setting a `return_value`; when the test is run, +the mock library will intercept a call to the `collection.find` inside +`MyModule.connector` (because that module imported the PyMongo driver into its +namespace as `connector`) and return the `return_value` instead. + +Simple when someone explains it like this, right? Well, at least I hope you got +the basics of this mocking stuff. + +Now, what if you had to count the number of results? It's pretty damn easy to +realize how to do so: just call `count()` on the resulting list, or make it +return an object that has a `count()` method. + +The whole problem we had was that the result of `find()` was irrelevant and +all we wanted was the count. Something like + +```python +def has_values(): + elements = connector.collection.find({'field': 'value'}).count() + return elements > 1 +``` + +First of all, you can't patch `MyModule.connector.collection.find.count` +because you'll only stub the `count` call, not `find`, which will actually try +to connect to MongoDB; so the original patch is required. And you can't patch +both `find` and `count` because the first patch will return a new `MagicMock` +object, which will not be patched (after all, it is *another* object).
The +original developer tried to fix it this way: + +```python +mocked_find.count.return_value = 0 +``` + +... which, again, doesn't work because the call to `find()` will return a +`MagicMock` that doesn't have its `count` patched. But the developer never +realized that because `MagicMock` tries its best to *not* blow up your tests, +including having return values for conversions like... int. And it will always +return 1. + +Is your head spinning yet? Mine sure did when I realized the whole mess that was +being made. And let me repeat this: The problem was *not* that MongoDB was +being mocked, but that it was being *mocked in the wrong way*. + +The solution? As pointed out above, make `find` return an object with a `count` +method. + +```python +count_mock = MagicMock(return_value=0) +mocked_find.return_value = MagicMock( + **{'count': count_mock}) +``` diff --git a/content/pre-order-the-case-of-no-mans-sky.md b/content/pre-order-the-case-of-no-mans-sky.md new file mode 100644 index 0000000..bededc6 --- /dev/null +++ b/content/pre-order-the-case-of-no-mans-sky.md @@ -0,0 +1,60 @@ ++++ +title = "Pre-Orders: The Case of No Man's Sky" +date = 2016-08-25 +category = "thoughts" + +[taxonomies] +tags = ["pre-order", "grim dawn", "no man's sky", "en-au"] ++++ + +[No Man's Sky](http://www.no-mans-sky.com/) is getting a lot of heat recently +because, well, the game is not all that the developers promised. And a lot of +people are putting the blame on pre-orders and whatnot. + + + +Thing is, this is not a problem with pre-orders. This is a problem with a +development company not keeping up with the times. + +Example: [Grim Dawn](http://www.grimdawn.com/). Although not a pre-order thing +per se, the game was on Kickstarter. Today, the game is polished, fun, has +lots of stuff to do, and nowhere is there someone claiming this "pre-order" +thing ruined the game.
+ +The difference between Grim Dawn and No Man's Sky is that Crate, the +developers of the former, continuously delivered versions to get feedback. +Falling through the world? Ok, we can fix. Game doesn't run on your rig even +when you have the minimal specs? There is something wrong with our engine. +That feature? Yeah, it's too big for now, we'll work on it later. + +Not on this list, but ArenaNet did something close to that with Guild Wars 2: +People who pre-purchased the game -- an "extreme" version of pre-order -- +could participate in the closed beta events. Those events, although not +spanning whole maps, would allow players to experience some part of +the game and return feedback. They would even claim "we just want to stress +the servers, so weird things could happen" and people were fine with that. + +Hello Games, on the other hand, did all development behind closed doors. Sure +they are a small company, but there was nothing stopping them from actually +doing some open beta test or whatever to receive feedback. Well, except one +thing: Sony. + +Sony injected money into Hello Games for their first title (I was about to claim +"a lot of money" but heck if I know how much they funded) and wanted it on +their console. Now, consoles do not have a "here, play for testing" or "sign up +for this and we'll add your console ID in our database and you can download +the game". To keep things hyped, no one could see the game before release. +No previews, no betas, no nothing. Feature wasn't fun? Who would know, it's +scrapped now. The engine blows up on certain configurations? The only way to check +this is after the final release. + +So, again, it's not a problem with "Pre-orders are bad and you should feel +bad". This is a problem with a company not keeping up with the times.
A lot of +companies are now sharing things beforehand to get more feedback than their +friends and family can give: Microsoft is continuously releasing Windows versions +through their "Windows Insider" program, which anyone can join; Apple is +giving betas of all their OSes to anyone who wants to test them. The idea of +"many eyes make all bugs shallow" finally caught on and people realised it +was right. + +But, apparently, Hello Games + Sony didn't. diff --git a/content/python-2-3-six.md b/content/python-2-3-six.md new file mode 100644 index 0000000..0ca698c --- /dev/null +++ b/content/python-2-3-six.md @@ -0,0 +1,312 @@ ++++ +title = "Python 2 + 3 = Six" +date = 2016-11-21 + +category = "code" + +[taxonomies] +tags = ["python", "six", "python 2", "python 3", "tchelinux", "pt-br"] ++++ + +"Six" is a small Python library that can help you move your +code from Python 2 to Python 3. + + + +{% note() %} +(This post is related to the talk I gave on November 19 +at TchêLinux. The slides can be found +[in the presentations area](http://presentations.juliobiason.net/python23six.html).) +{% end %} + + +Before anything else, one thing we need to answer is: Why would anyone use +Python 3?
+ +* All strings are unicode by default; this solves the pile of + macabre, annoying, damned, wretched `UnicodeDecodeError` problems; +* `Mock` is part of the Python standard library; it is still possible to install it using `pip` and + the syntax is exactly the same, but it is one less dependency; +* `Enum` is part of the Python standard library; Enum is one of the most + interesting abuses of classes in Python and really useful; +* AsyncIO and the whole lazy-evaluation side that Python 3 brought; a lot of things + in Python 3 stopped "generating a list" and now return an + iterator or a generator; with AsyncIO, this idea of generating things lazily goes + one step further and, according to people smarter than me, with + PyUV, Python manages to be as fast as or faster than Node; +* And, above all, **Python 2 support ends in 2020!** + +{% note() %} +There is also string interpolation with the new `f` prefix; the +functionality is similar to calling `str.format` with `locals()`; for +example, `f'{element} {count}'` is equivalent to `'{element} +{count}'.format(**locals())` (as long as you have `element` and `count` as +local variables in your function). +{% end %} + +The last point is the most important one. You may think "but there are still three years +until then", but Christmas is coming, soon it's Carnival and, when you least +expect it, it's 2020. + +## The path to Python 3 + +For those who want to start porting their applications to Python 3 right away, there are two +ways: + +The first is to run your applications with `python -3 [script]`; this makes +the Python 2 interpreter warn about any statement that +cannot be converted correctly. I ran a personal +script [dated 2003](https://bitbucket.org/juliobiason/pyttracker) and +Python reported nothing.
+ +{% note() %} +Just to make things clearer: the code I was writing was already +more correct and followed more Pythonic standards; in 2014 I was still +seeing cases where code running on Python 2.6 still used `has_key()`, which +was deprecated in Python 2.3. +{% end %} + +There are several reasons for this: + +1. People got used to writing "Pythonic" code; the language itself + did not undergo major changes. +2. Although the Python language was meant to lose some things, these were + slowly brought back; one example is the string + interpolation operator (`%`), which was meant to give way to + `str.format` but ended up staying. + +The second way to port your code to Python 3 is to use the +`2to3` tool. It checks for the known Python 3 changes (for example, +`print` becoming a function, the changes to some standard library packages) +and presents a patch to be applied afterwards. + +Among the conversions that `2to3` makes are replacing calls to +`iter`-something with the version without the prefix (for example, +`iteritems()` becomes simply `items()`); `print` is +converted to a function; several adjustments are made to calls to the +`urllib` and `urlparse` libraries (these two were merged in Python 3 +and the first had several internal reorganisations); `xrange` becomes +`range`; `raw_input` is now called `input` and treats its +result differently, among others. + +There is just one small problem with this Python 2 to Python 3 conversion: +As can be seen in the list above, some commands exist in both versions, +but with different behaviour; for example, `iteritems()` is converted +to simply `items()`, but both methods exist in Python 2: the +first returns an iterator and the second returns a new list with the tuples +of all the elements of the dictionary (in Python 3, `items()` returns a +lazy view rather than a list).
So, although the code is grammatically the same in both Python 2 +and Python 3, semantically the two are different. + +This problem of "same commands, different results" can be a +big one if the system runs in environments that do not +allow easy modification -- for example, it runs on CentOS 4 +or still needs compatibility with Python 2.6, both "problems" being, in +fact, requirements of the infrastructure group. + +## Six (and `__future__`) to the Rescue + +To solve the problem of having code that needs to run on both +versions, there is the [Six](https://pythonhosted.org/six/) library; it acts as the +go-between for Python 2 and Python 3 and provides an interface so that +Python 2 code can be ported to Python 3 while keeping compatibility. + +In a (relatively silly) example: + +```python +import collections + +class Model(object): + def __init__(self, word): + self._count = None + self.word = word + return + + @property + def word(self): + return self._word + + @word.setter + def word(self, word): + self._word = word + self._count = collections.Counter(word) + + @property + def letters(self): + return self._count + + def __getitem__(self, pos): + return self._count[pos] + +if __name__ == "__main__": + word = Model('This is an ex-parrot') + for letter, count in word.letters.iteritems(): + print letter, count +``` + +In this example, we have a class that stores a sentence and the number of times +each letter appears, using `Counter` to do so (since `Counter` +counts the number of times an element appears in an iterable and strings +*are* iterable). + +In this example, we have the following problems: + +1. `class Model(object)`: in Python 3, all classes are "new-style" and + using `object` is no longer necessary (but it does not affect how the + class works); + +2.
+   `for letter, count in word.letters.iteritems()`: as discussed earlier,
+   `iteritems()` no longer exists and became `items()`; `items()` exists in
+   Python 2, but it behaves differently. In our case, the result of the
+   operation stays the same but, on Python 2, memory consumption goes up
+   every time the call is made, since a new list is built.
+
+3. `print letter, count`: `print` is now a function and works slightly
+   differently from the Python 2 version.
+
+So, to make this code compatible with Python 2 and Python 3 at the same
+time, we have to do the following:
+
+> `class Model(object)`
+
+Nothing needs to be done.
+
+> `print letter, count`
+
+```python
+from __future__ import print_function
+print('{} {}'.format(letter, count))
+```
+
+`print` as a function can be "brought from the future" using the
+`__future__` module (available since Python 2.6); since printing several
+variables separated by commas is not recommended, using `str.format` is
+the recommended way.
+
+A better option (in my opinion) is:
+
+```python
+from __future__ import print_function
+print('{letter} {count}'.format(letter=letter,
+                                count=count))
+```
+
+This way, the parameters used in the output are named and can be changed.
+This produces a strange error (a `KeyError`) when a name used in the format
+string is not passed in the parameter list of `format`, but in more
+complex strings the result is easier to understand (for example, I find
+`{letters} appears {count} times` easier to understand than `{} appears {}
+times`; also, it is possible to change the order of the variables in the
+format string without having to change the order in the parameter list).
+
+An even better option is:
+
+```python
+import six
+six.print_('{letter} {count}'.format(letter=letter,
+                                     count=count))
+```
+
+With Six, the dependency on `__future__` goes away, and the same call works
+on every Python version that Six supports.
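As a side note on the named placeholders from the `print` discussion above, here is a minimal sketch using only the standard library. The `render` helper and the two template strings are hypothetical names made up for this illustration, not part of the original example; the sketch shows placeholders being reordered without touching the arguments, and the `KeyError` raised when a name is missing.

```python
# Hypothetical templates: same names, different order in the text.
TEMPLATE = '{letter} appears {count} times'
REORDERED = '{count} occurrences of {letter}'


def render(template, **values):
    """Fill a named-placeholder template, reporting any missing name."""
    try:
        return template.format(**values)
    except KeyError as missing:
        # str.format raises KeyError carrying the placeholder name
        # that was not passed as a keyword argument.
        return 'missing value for placeholder {}'.format(missing)


if __name__ == "__main__":
    print(render(TEMPLATE, letter='a', count=2))
    print(render(REORDERED, letter='a', count=2))
    print(render(TEMPLATE, letter='a'))  # `count` left out on purpose
```

Catching the `KeyError` here is only a way to make the failure visible in the output; in real code it is probably better to let the exception propagate so the missing name gets fixed.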
+
+> `for letter, count in word.letters.iteritems():`
+
+```python
+import six
+for letter, count in six.iteritems(word.letters):
+```
+
+Six provides a unified interface for iterating over items in both Python 2
+and Python 3: `six.iteritems()` calls `iteritems()` when running on
+Python 2 and `items()` when running on Python 3.
+
+And so our relatively silly code is now compatible with Python 2 and
+Python 3 and runs identically on both.
+
+But let's move on to a real example:
+
+```python
+import urllib
+import urlparse
+
+def add_querystring(url, querystring, value):
+    frags = list(urlparse.urlsplit(url))
+    query = frags[3]
+    query_frags = urlparse.parse_qsl(query)
+    query_frags.append((querystring, value))
+    frags[3] = urllib.urlencode(query_frags)
+    return urlparse.urlunsplit(frags)
+
+if __name__ == "__main__":
+    print add_querystring('http://python.org', 'doc', 'urllib')
+    print add_querystring('http://python.org?doc=urllib',
+                          'page', '2')
+```
+
+{% note() %}
+Yes, yes, the code could simply be "check whether the URL has a question
+mark; if it does, append `&` and the query string; if it doesn't, append
+`?` and the query string". The point is: this way, I get a solution that
+accepts any URL, in any format, with anything in the middle, because the
+Python standard library guarantees that it will be parsed
+correctly.
+{% end %}
+
+This is the code of a function used to add a query string to a URL. The
+problem with this function is that both `urllib` and `urlparse` went
+through major changes and even ended up under the same module (everything
+used here now lives in `urllib.parse`).
+
+To make this code compatible with Python 2 and 3 at the same time, you need
+to use the `six.moves` module, which covers all these relocations in the
+standard library (including, in this case, `urllib` and `urlparse`).
+
+```python
+import six
+
+def add_querystring(url, querystring, value):
+    frags = list(six.moves.urllib.parse.urlsplit(url))
+    query = frags[3]
+    query_frags = six.moves.urllib.parse.parse_qsl(query)
+    query_frags.append((querystring, value))
+    frags[3] = six.moves.urllib.parse.urlencode(query_frags)
+    return six.moves.urllib.parse.urlunsplit(frags)
+
+if __name__ == "__main__":
+    six.print_(add_querystring('http://python.org', 'doc', 'urllib'))
+    six.print_(add_querystring('http://python.org?doc=urllib',
+                               'page', '2'))
+```
+
+What was done here was to use `six.moves.urllib.parse`. This structure is
+no accident: in Python 3, the `urlparse` functions now live in
+`urllib.parse`; Six assumed that the right location for the functions
+"inside itself" would be the packages used in Python 3.
+
+And so we have two examples of programs that run the same way on both
+Python 3 and Python 2.
+
+Also, here is a tip: if some software you use does not run correctly with
+Python 3, using Six can help keep the current code going until a rewrite
+solves the problem.
+
+## Other Questions
+
+### What about being stuck with Six forever?
+
+A good share of applications have now set a cut-off for supporting their
+versions that run on Python 2. For example, Django announced that version 2.0
+of the framework will support Python 3 only, and that 1.11, the last release
+that runs on Python 2, will be supported until 2020.
+
+### How hard is it to port to Python 3?
+
+Not very hard -- now. Many of the things that were removed (or were going to
+be) and caused headaches during conversion came back; the classic case is the
+string interpolation operator `%`, which was going to be removed and replaced
+by `str.format`, but ended up staying.
+Another reason is that scripts are more "Pythonic" nowadays, largely thanks
+to people like [Raymond
+Hettinger](https://rhettinger.wordpress.com/), who has been making excellent
+videos on how to write Python code with Python (that is, "Pythonic" code).
+And, as a personal anecdote, I can say that my code from 2003 ran with
+`python -3` without raising a single warning.
diff --git a/content/the-day-i-found-my-old-code.md b/content/the-day-i-found-my-old-code.md
new file mode 100644
index 0000000..0244eb2
--- /dev/null
+++ b/content/the-day-i-found-my-old-code.md
@@ -0,0 +1,102 @@
++++
+title = "The Day I Found My Old Code"
+date = 2015-12-18
+category = "code"
+
+[taxonomies]
+tags = ["code", "python", "pep8", "pylint", "en-au"]
++++
+
+Found a piece of code I wrote 2 years ago, following a lot of linters. I'm
+amazed how readable the code still is.
+
+
+
+Today, walking through a client repository, I found a module I wrote two years
+ago in Python. At the time, we lacked the knowledge to write proper tests, but
+we used a lot of other tools: PEP8 and Pylint, mostly.
+
+Today-me is pissed with two-years-ago-me for the lack of tests, but where my
+memory forgot the nuances of the project, the huge amount of comments and
+proper documentation makes up for it.
+
+For example, every pylint disable has an explanation about why it was
+disabled:
+
+```python
+# flask has a weird way to deal with extensions, which works fine but confuses
+# the hell out of PyLint.
+```
+
+Related modules are loaded in sequence, with line breaks between different
+sources:
+
+```python
+from flask.ext.babel import Babel
+from flask.ext.babel import refresh
+
+from flask.ext.gravatar import Gravatar
+
+from werkzeug.routing import NotFound
+from werkzeug.routing import RequestRedirect
+```
+
+Every variable, every function, is documented in proper Sphinx format, which
+contributes to understanding what the variable/function does:
+
+```python
+#: Session duration time
+#: The time is given as number and a time interval ("m" for minutes, "h" for
+#: hours, "d" for days and "w" for weeks), e.g., "3d". A value of "None" will
+#: make the session last till the user closes the browser.
+SESSION_EXPIRATION = "1d"
+```
+
+```python
+def reroute(route):
+    """Route control. The route must exist in the known routes list to return
+    a valid result; unknown routes will be redirected to the 404 page; if the
+    route exists but it's marked as "maintenance", the request will be
+    redirected to the 503 page."""
+```
+
+Also, I found a class with a docstring of about 20 lines. It explains every
+single parameter in its `__init__` function, which makes perfect sense when
+you generate the documentation.
+
+Where the functions lacked a good name (due to having a good name inside their
+own objects/modules), a comment was added to explain what the function was
+actually doing:
+
+```python
+inject(current_app)  # inject values if run stand-alone modules
+load_routes(current_app)  # load the routing information
+register_filters()  # register jinja filters
+register_functions()  # register jinja functions
+register_tests()  # register jinja tests
+set_session_time()  # define the cookie time
+```
+
+Also, I had the slight habit of putting large comments in the code when
+something was kinda hacky:
+
+```python
+# Now you're asking yourself: "Why heuristic find?"
+# The reason is simple: in _function(), we add a new endpoint on top of one
+# existing endpoint; because we do that on top of anything, we don't know,
+# for sure, which one of the parameters the user (the other programmer, in
+# this case) used in their URLs. So we need to go through all parameters
+# they expect to receive in their detail function in the hopes of finding
+# something that actually matches a "pk".
+```
+
+It doesn't make much sense here, but believe me, it works. I was just reading
+the code with a function called `heuristic_find` and I was "Man, which drugs
+did I take to call it 'heuristic_find'?" And BOOM, there it was: the reason it
+was called that.
+
+Ok, honesty time: I wasn't the only one writing this code. But thanks to the
+client's input, I started and enforced all those rules (and wrote a huge part
+of the base code), and the code is still readable two years later.
+
+Yeah, I'm proud of it.
diff --git a/content/the-sad-life-of-walter-mitty.md b/content/the-sad-life-of-walter-mitty.md
new file mode 100644
index 0000000..11b9529
--- /dev/null
+++ b/content/the-sad-life-of-walter-mitty.md
@@ -0,0 +1,43 @@
++++
+title = "The Sad Life of Walter Mitty"
+date = 2015-03-28
+category = "thoughts"
+
+[taxonomies]
+tags = ["movies", "the secret life of walter mitty", "rethink", "review", "en-au"]
++++
+
+I once wrote about [The Secret Life of Walter
+Mitty](http://juliobiason.net/2014/11/13/the-secret-life-of-walter-mitty-2013/)
+and how it is a nice story about a guy outgrowing his daydreams.
+
+
+
+But today I realized I saw everything wrong.
+
+The second time I watched the movie, in the scene where Walter talks to Todd
+(from E-Harmony) on top of the Himalayas, I thought "Well, that's one hell of
+a mobile company, they have signal on top of the Himalayas".
+
+The third time I realized *how* the signal was that good: Walter never went to
+the Himalayas.
+
+Let's assume that, in the story, Walter really went to Greenland and Iceland
+and came back.
+And then he got fired for losing negative #25. This is where I believe
+everybody is tricked. At that point, Walter actually lost his only connection
+to real life (his job) and descended into a full-time illusion.
+
+That's why the recluse Mitty went to Afghanistan and had to give cake to guys
+with guns. That's how his call is crystal clear on top of the Himalayas. That's
+why Todd, who never really knew Walter, went to the airport to rescue him.
+That's how his E-Harmony profile suddenly was the hottest profile ever. That's
+why the piano check was so large, so he wouldn't need to worry about his
+unemployed life. That's why his mom saved the wallet. This is how he finally
+manages to face Ted. That's why Cheryl is right there when he gets his
+severance check. And that's how his damn face appears on the cover of Life.
+That even explains why Sean never took the picture -- if there is no picture,
+there is nothing to show that the whole thing was a dream.
+
+When you go with "he's in complete disconnection with reality and he lives in
+his imagination now", the whole ending stops being a succession of lucky
+happenings and starts to make sense. A sad sense, but a sense, nonetheless.
diff --git a/content/when-i-used-pep8-to-fuck-up-code.md b/content/when-i-used-pep8-to-fuck-up-code.md
new file mode 100644
index 0000000..5071ff1
--- /dev/null
+++ b/content/when-i-used-pep8-to-fuck-up-code.md
@@ -0,0 +1,66 @@
++++
+title = "When I Used PEP8 To Fuck Up Code"
+date = 2016-07-19
+
+category = "code"
+
+[taxonomies]
+tags = ["python", "pep8", "readability", "en-au"]
++++
+
+We "inherited" some Python code recently. Although another team was working on
+it, we now should support it and keep it going. The previous team at least
+tried to use Pylint and follow PEP8. And I say "tried" because their
+`pylintrc` has a couple of exceptions and their PEP8 extended the
+maximum column to 100.
+
+
+
+{% note() %}
+Pylint exceptions are almost the norm these days, especially in
+a Django project. But plain, pure `pylintrc` exclusions without giving any
+pointers on *why* you're adding that exception are dumb, IMHO. I had a
+project where we decided to add pylint exceptions inside the code, but for
+every exception there should be a comment preceding it explaining the
+reason for the exception ("the framework doesn't expose this directly",
+"pylint can't see this variable, but it is there", "It's the common place
+to name the variable this way" and so on).
+{% end %}
+
+Quick pause here 'cause I know a bunch of people will complain with a "But
+monitors these days are very large and you don't need to focus on column 80;
+we don't use CGA anymore, old person!". The thing about the maximum column at
+80 is *not* about "being visible on every CGA" but actually a measure of
+readability: If you speak shorter, concise sentences, people will get the idea
+quickly; if you keep a stream of words non-stop without reaching a conclusion
+and without any punctuation to keep the ideas flowing, you will end up with
+something that is easier to forget and whose central idea will be lost
+(and I freaking hope you got what I just did). It's tiring to read a very long
+sentence; it's easy to keep the context on a short sentence.
+
+In the spirit of "proper" PEP8, I reformatted one of the failing tests to
+follow the 80 column limit. And now the code looks like crap. And I'll commit
+it like that. It's not because I hate my coworkers, but to point out that,
+because it's a pain to read, it means the structure of the code is too
+complex. If someone comes and says "damn, this test is hard to read", I'll be
+able to point out that it is not the test that is hard to read, but the code
+that reached the point where its complexity is leaking to the test code; it is
+now a good time to refactor this to simplify things and make them easier to
+read.
+
+{% note() %}
+Actually, the reason for it to fail is too damn fun and worth a proper blog
+post. Stay tuned!
+{% end %}
+
+Not that we can simply stop working and fix the damn architecture of it, but we
+can at least keep this beast around till everybody gets pissed and realizes it
+*desperately needs* a refactor.
+
+{% note() %}
+Weird thing, people usually assume some countries are the center of bad code;
+this baseline is coming from a "first world country" and, heck, it has one of
+the worst designs I ever saw. I'll not name names here to protect the (maybe)
+innocent. But in the second week of training, I realized this whole project
+has, at least, 6 months of technical debt already.
+{% end %}