Julio Biason
5 years ago
17 changed files with 534 additions and 5 deletions
@ -0,0 +1,57 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Cognitive Cost Is The Readability Killer" |
||||||
|
date = 2019-06-26 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "cognitive dissonance", "cognitive cost"] |
||||||
|
+++ |
||||||
|
|
||||||
|
"[Cognitive dissonance](https://en.wikipedia.org/wiki/Cognitive_dissonance)" |
||||||
|
is a fancy way of saying "I need to remember two (or more) different and |
||||||
|
contradicting things at the same time to understand this." Keeping those |
||||||
|
different things in your head creates a cost and it keeps accumulating the |
||||||
|
more indirect the things are ('cause you'll have to keep all those in your |
||||||
|
head). |
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
(Disclaimer: I like to use the expression "cognitive dissonance" to make me |
||||||
|
sound smarter. I usually explain what it means, though.) |
||||||
|
|
||||||
|
To give you an example of a (very mild) cognitive cost, I'll show you this: |
||||||
|
|
||||||
|
* You have a function called `sum()`. It does the sum of the numbers of a |
||||||
|
list. |
||||||
|
* You have another function, called `is_pred()`. It gets a value and, if it |
||||||
|
fits the predicate -- a test, basically -- returns True; otherwise, |
||||||
|
returns False. |
||||||
|
|
||||||
|
So, pretty simple, right? One function sums numbers and another returns a |
||||||
|
boolean. |
||||||
|
|
||||||
|
Now, what would you say if I showed you this, in Python:
||||||
|
|
||||||
|
```python
sum(is_pred(x) for x in my_list)
```
||||||
|
|
||||||
|
Wait, didn't I say that `sum()` sums numbers? And that `is_pred()` returns a |
||||||
|
boolean? How can I sum booleans? What's the expected result of True + True + |
||||||
|
False? |
||||||
|
|
||||||
|
Sadly, this works. Because someone, a long time ago, didn't think booleans were
worth a thing and used an integer instead. And everyone else since then made
the same stupid mistake.
||||||
|
|
||||||
|
But, for you, you'll now read a line that says "summing a boolean list returns |
||||||
|
a number". And that's two different, disparate things that you suddenly have |
||||||
|
to keep in mind when reading that line. |
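
To make that cost visible, here is a small sketch. The predicate and the list are made up for illustration (the text above never says what `is_pred()` tests), but the behaviour is plain Python: `bool` is a subclass of `int`, so `True` counts as 1 and `False` as 0.

```python
def is_pred(x):
    # Hypothetical predicate, just for illustration: is the value even?
    return x % 2 == 0


my_list = [1, 2, 3, 4, 6]

# The version from the text: it silently adds booleans as if they were integers.
implicit_count = sum(is_pred(x) for x in my_list)

# A version that says what it means: add 1 for every value that passes the test.
explicit_count = sum(1 for x in my_list if is_pred(x))

print(True + True + False)             # 2
print(implicit_count, explicit_count)  # 3 3
```

Both versions give the same number; the second one just doesn't ask the reader to remember that booleans are secretly integers.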
||||||
|
|
||||||
|
That's why [types are important](/books/things-i-learnt/data-types). Also,
this may sound a bit like [the magical number
seven](/books/things-i-learnt/magical-number-seven), 'cause you have to keep
two things in your mind at the same time but, although that's nowhere near
seven, they are not the same kind of thing, and they have opposite (for weird
meanings of "opposite", in this case) meanings.
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/magical-number-seven", prev_chapter_title="The Magical Number Seven, Plus Or Minus Two", next_chapter_link="/books/things-i-learnt/functional-programming", next_chapter_title="Learn The Basics of Functional Programming") }}
@ -0,0 +1,35 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Thinking Data Flow Beats Patterns" |
||||||
|
date = 2019-06-26 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "data flow", "design patterns"] |
||||||
|
+++ |
||||||
|
|
||||||
|
When you're trying to find a solution to your problem, think about the way the
data will flow through your code.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
Instead of focusing on design patterns, a better way is to think about how the
data will flow -- and be transformed -- through your code.
||||||
|
|
||||||
|
For example, the user will input a number. You'll get this number and find the
respective record in the database. This is a transformation -- no, it's not
"I'll get the number and receive a completely different thing based upon it";
you're actually transforming the number into a record, using the database as
the transformation.
||||||
|
|
||||||
|
(Yes, I know, it's not that clear at first glance, but you have to think of
them as the same data with different representations.)
||||||
|
|
||||||
|
Most of the time I did that, I managed to come up with a clearer design for my
applications. I didn't even think about how many functions/classes would be
needed to do this kind of transformation; that was something I came up with
_after_ I could see the data flow.
||||||
|
|
||||||
|
In a way, this way of thinking makes things clearer 'cause you have a list of
transformation steps you need to do, so you can write them one after another,
which prevents a lot of bad code in the future.
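
As a rough sketch of the number-to-record example, here is what that list of steps could look like in Python. Everything here is an assumption made for illustration: the `DATABASE` dictionary stands in for a real database, and `parse_input`, `find_record` and `render` are made-up names, not something from a real project.

```python
# Hypothetical in-memory stand-in for the database.
DATABASE = {42: {"id": 42, "name": "Alice"}}


def parse_input(raw: str) -> int:
    # Step 1: the text the user typed becomes a number.
    return int(raw.strip())


def find_record(record_id: int) -> dict:
    # Step 2: the number becomes a record, with the database as the transformation.
    return DATABASE[record_id]


def render(record: dict) -> str:
    # Step 3: the record becomes something you can show back to the user.
    return f"{record['id']}: {record['name']}"


# The data flow, written as one step after another.
user_id = parse_input(" 42\n")
record = find_record(user_id)
line = render(record)
print(line)  # 42: Alice
```

The number of functions fell out of the steps, not the other way around.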
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/patterns-not-solutions", prev_chapter_title="Design Patterns Are Used to Name Solution, Not Find Them", next_chapter_link="/books/things-i-learnt/magical-number-seven", next_chapter_title="The Magical Number Seven, Plus Or Minus Two") }}
@ -0,0 +1,68 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Learn The Basics of Functional Programming" |
||||||
|
date = 2019-06-26 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "functional programming"] |
||||||
|
+++ |
||||||
|
|
||||||
|
At this point, you should have at least heard about how cool functional
programming is. There are a lot of concepts here, but you should keep at least
the very basic ones in mind.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
A lot of talks about functional programming come with weird words like |
||||||
|
"functors" and "monads". It doesn't hurt to know what they really mean |
||||||
|
(disclaimer: I still don't). But some other stuff coming from functional |
||||||
|
programming is actually easy to understand and grasp. |
||||||
|
|
||||||
|
For example, immutability. This means that all your data can't change once
it's created. You have a record with user information and the user changed
their password? No, do not change the password field; create a new user record
with the updated password and discard the old one. Sure, it creates a lot of
create-and-destroy sequences which makes absolutely no sense (why would you
allocate memory for a new user, copy everything from the old one to the new
one, update one field, and deallocate the memory from the old one? It makes no
sense!) but, in the long run, it prevents weird results, especially when you
understand and start using threads.
||||||
|
|
||||||
|
(Basically, you're avoiding a shared state -- the memory -- between parts of |
||||||
|
your code.) |
||||||
|
|
||||||
|
Another useful concept is pure functions. Pure functions are functions that, |
||||||
|
called with the same parameters, always return the same result, no matter how |
||||||
|
many times you call them. One example of a _non_ pure function is `random()`: |
||||||
|
each time you call `random()`, you get a different number[^1]. An example of a |
||||||
|
pure function would be something like this in Python: |
||||||
|
|
||||||
|
```python
def mult(x):
    # Pure: the result depends only on the argument, nothing else.
    return x * 4
```
||||||
|
|
||||||
|
No matter how many times you call `mult(2)`, it will always return 8. Another |
||||||
|
example could be our immutable password change above: You could easily write a |
||||||
|
function that receives a user record and returns a new user record with the |
||||||
|
password changed. You could call it with the same record over and over again
and it will always return the same resulting record.
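
A minimal sketch of that password change, assuming a made-up `User` record with only two fields. The `frozen=True` flag and `dataclasses.replace` are standard library; the record itself and the `change_password` name are just for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class User:
    # Hypothetical user record; frozen means fields can't be reassigned.
    name: str
    password: str


def change_password(user: User, new_password: str) -> User:
    # Pure function: builds a new record and leaves the old one untouched.
    return replace(user, password=new_password)


old = User("alice", "hunter2")
new = change_password(old, "correct horse")

print(old.password)  # hunter2
print(new.password)  # correct horse
```

Trying `old.password = "something"` raises an error instead of quietly changing a record some other part of the code may still be reading.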
||||||
|
|
||||||
|
Pure functions are useful 'cause they are, first and foremost, easy to test.
||||||
|
|
||||||
|
Second, they are easy to chain, especially in a [data
flow](/books/things-i-learnt/data-flow) design: Because they don't have an
internal state (which is the real reason they are called pure functions), you
can easily call one after the other and no matter how many times you pass
things around, they still produce the same result. And because each function,
given the same input, produces the same result, chaining them all _also_
produces the same result given the same inputs.
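
Here is a small sketch of that chaining. The three functions and the `pipeline` helper are made up for illustration; the only real library piece is `functools.reduce`.

```python
from functools import reduce


def add_one(x):
    return x + 1


def double(x):
    return x * 2


def square(x):
    return x * x


def pipeline(value, *steps):
    # Feed the output of each step into the next one, left to right.
    return reduce(lambda acc, step: step(acc), steps, value)


# (3 + 1) = 4, 4 * 2 = 8, 8 * 8 = 64 -- every single time.
print(pipeline(3, add_one, double, square))  # 64
```

Since no step keeps any state, the whole chain behaves like one bigger pure function.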
||||||
|
|
||||||
|
Just those two concepts can make code longer (again, you're creating a new
user record instead of simply changing one field), but the final result is
more robust code.
||||||
|
|
||||||
|
[^1]: Except in Haskell, but it does require sending the seed every time, so |
||||||
|
you end up with random values based on the seed, so even there it is a pure |
||||||
|
function. |
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/cognitive-cost", prev_chapter_title="Cognitive Cost Is The Readability Killer", next_chapter_link="/books/things-i-learnt/integration-tests", next_chapter_title="Unit Tests Are Good, Integration Tests Are Gooder") }} |
@ -0,0 +1,92 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - The Magical Number Seven, Plus Or Minus Two" |
||||||
|
date = 2019-06-26 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "complexity"] |
||||||
|
+++ |
||||||
|
|
||||||
|
"[The magical number](https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two)" |
||||||
|
is a psychology article about the number of things one can keep in their mind |
||||||
|
at the same time. |
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
Twice I've seen this weird construction where a function would do some
processing, but its return value was the return of a second function plus
some bit of processing. Nothing major. But the second function would also do
some processing and call a third function. And the third function would call a
fourth. And the fourth a fifth. And the fifth, a sixth function.
||||||
|
|
||||||
|
Something like this:
||||||
|
|
||||||
|
```
func_1
 +-- func_2
      +-- func_3
           +-- func_4
                +-- func_5
                     +-- func_6
```
||||||
|
|
||||||
|
Now, when you're trying to understand this kind of code to find a problem, |
||||||
|
you'll have to keep in mind what the first, second, third, fourth, fifth and |
||||||
|
sixth functions do, 'cause they are all calling each other (inside them). |
||||||
|
|
||||||
|
This causes some serious mental overflow that shouldn't be necessary. |
||||||
|
|
||||||
|
Not only that, but imagine that you put a log before and after `func_1`: The
log before shows the data that's being sent to `func_1`, and the log after
shows its result.
||||||
|
|
||||||
|
So you'd end up with the impression that `func_1` does a lot of stuff, when it |
||||||
|
actually is passing the transformation along. |
||||||
|
|
||||||
|
(I had a weird experience with a function called `expand`: logging before the
call would show some raw, compressed data, but the log after it was not the
expanded data, but actually a list of already processed data derived from the
compressed data.)
||||||
|
|
||||||
|
What would be a better solution, you may ask? |
||||||
|
|
||||||
|
Well, instead of making `func_1` call `func_2`, you can make it return its
result (which may not be the final result, anyway) and _then_ call `func_2`
with that result.
||||||
|
|
||||||
|
Something like: |
||||||
|
|
||||||
|
```
result1 = func_1(data)
result2 = func_2(result1)
result3 = func_3(result2)
result4 = func_4(result3)
result5 = func_5(result4)
result6 = func_6(result5)
```
||||||
|
|
||||||
|
Now you can see _exactly_ how the data is being transformed -- and, obviously,
the functions would have better names, like `expand`, `break_lines`,
`name_fields` and so on, so you can see that the compressed data I mentioned
before is actually being decompressed, the content is being broken line by
line, the fields in each line are getting names and so on (and one could even
claim that it would make things clearer if there was a function after
`break_lines` which would just `break_fields`, which would make `name_fields`
more obvious -- and in a construction like this it would be almost trivial to
add this additional step).
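
To give an idea of how that could look, here is a sketch using the names from the paragraph above. The bodies are all assumptions: I'm pretending the compressed data is zlib-compressed, comma-separated text with a name and a city per line, which is not what the real code did.

```python
import zlib


def expand(compressed: bytes) -> str:
    # Assumption: the raw data is zlib-compressed UTF-8 text.
    return zlib.decompress(compressed).decode("utf-8")


def break_lines(content: str) -> list[str]:
    return content.splitlines()


def break_fields(lines: list[str]) -> list[list[str]]:
    # Assumption: fields are separated by commas.
    return [line.split(",") for line in lines]


def name_fields(rows: list[list[str]]) -> list[dict[str, str]]:
    # Assumption: every row has exactly two fields, a name and a city.
    return [{"name": name, "city": city} for name, city in rows]


raw = zlib.compress("alice,Sydney\nbob,Perth".encode("utf-8"))

content = expand(raw)
lines = break_lines(content)
rows = break_fields(lines)
records = name_fields(rows)

print(records)
```

A log between any two of those lines shows exactly what that one step did, and nothing else.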
||||||
|
|
||||||
"But that isn't performant!" someone may cry. Well, maybe it's just a bit less
performant than the original chained calls ('cause the chained version doesn't
create and destroy a frame in the stack for each step; it just piles them up
and then unstacks them all in the end), but heck, optimization is for
compilers, not people. Your job is to make the code _readable_ and
_understandable_. If you need performance, you can think of a better sequence
of steps, not some "let's make this a mess to read" solution.
||||||
|
|
||||||
|
Just a quick note: Although the famous paper mentions that the number is
around 7, newer research actually points to a number way lower than that,
around 4. So simply making `func_1` call `func_2`, which would call `func_3`,
which would call `func_4` may be enough to overload someone and make them
lose track of what the code does.
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/data-flow", prev_chapter_title="Thinking Data Flow Beats Patterns", next_chapter_link="/books/things-i-learnt/cognitive-cost", next_chapter_title="Cognitive Cost Is The Readability Killer") }}
@ -0,0 +1,35 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Don't Mess With Things Outside Your Project" |
||||||
|
date = 2019-06-25 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "frameworks"] |
||||||
|
+++ |
||||||
|
|
||||||
|
Simple rule: Is the code yours or from your team? Good, go break it. Does it |
||||||
|
come from outside? DON'T. TOUCH. IT. |
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
Sometimes people are tempted to, instead of using the proper extension tools, |
||||||
|
change external libraries/frameworks -- for example, making changes directly |
||||||
|
into WordPress or Django. Believe me, I've seen my fair share of this kind of |
||||||
|
stuff going around. |
||||||
|
|
||||||
|
This is an easy way to make the project -- the team project, that is --
a huge security problem. As soon as a new version is released, you'll -- or,
better yet, someone who was not the person who decided to mess with outside
code -- have to keep your changes in sync with the main project and, pretty
soon, you'll find that the changes don't apply anymore and you'll leave the
external project in an old version, full of security bugs.
||||||
|
|
||||||
|
Not only would you end up with something that may very soon put your whole
infrastructure at risk, you also won't get any benefits from things in the new
versions, 'cause hey, you're stuck in the broken version!
||||||
|
|
||||||
|
Sometimes changing the external code directly is faster and cheaper, while
doing the same thing using extensions or actually coding around the problem,
even duplicating the framework functions, would probably take longer and make
you write more code. But in the long run, it's worth the time.
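
As a tiny illustration of "use the proper extension tools", take Python's standard `json` module, which doesn't know how to serialize dates. Instead of editing the module itself, you can use the extension point it already offers: subclass `json.JSONEncoder` and override `default`. The `DateEncoder` name and the payload are made up for this sketch.

```python
import json
from datetime import date


class DateEncoder(json.JSONEncoder):
    # Extend the library through its own hook instead of patching its source.
    def default(self, o):
        if isinstance(o, date):
            return o.isoformat()
        # Anything else still gets the library's normal behaviour (and errors).
        return super().default(o)


payload = json.dumps({"released": date(2019, 6, 25)}, cls=DateEncoder)
print(payload)  # {"released": "2019-06-25"}
```

When the library gets a new version, this code keeps working and there is nothing of yours to merge back in.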
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/use-structures", prev_chapter_title="If Your Data Has a Schema, Use a Structure", next_chapter_link="/books/things-i-learnt/resist-easy", next_chapter_title="Resist The Temptation Of Easy") }} |
@ -0,0 +1,38 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Design Patterns Are Used to Name Solution, Not Find Them"
||||||
|
date = 2019-06-25 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "design patterns"] |
||||||
|
+++ |
||||||
|
|
||||||
|
Most of the times I saw design patterns being applied, they were applied as a
way to find a solution, so you end up twisting a solution -- and, sometimes,
the problem itself -- to fit the pattern.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
My guess is that the heavy use of "let's apply _this_ design pattern" before |
||||||
|
even understanding the problem -- or even trying to solve it -- comes as a |
||||||
|
form of [cargo cult](/books/things-i-learnt/cargo-cult): I heard people used |
||||||
|
this pattern and solved their problem, so let's use it too and it will solve |
||||||
|
our problem. Or, worse: The design pattern was described by _Famous Person_,
so we must use it.
||||||
|
|
||||||
|
Here is the thing: Design patterns should _not_ be used as a way to find
solutions to problems. You may use some of them as a base for your solution,
but you must focus on the _problem_, not the _pattern_.
||||||
|
|
||||||
"Will a visitor pattern solve this?" is the wrong question. "What should we
do to solve our problem?" is the real question. Once you've gone there and
solved the problem, you may look and see if it is a visitor pattern -- or
whatever pattern. If it isn't, that's alright, 'cause you _solved the
problem_. If it is... well, congratulations, you now know how to name your
solution.
||||||
|
|
||||||
|
I've seen this happening a lot: People have a problem; people decide to use a
pattern; the pattern doesn't actually solve the problem (not at the 100% mark,
but above 50%); what happens then is that people start twisting the problem to
fit the pattern or, worse, add new layers to transform the problem into the
pattern.
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/gherkin", prev_chapter_title="Gherkin Is Your Friend to Understand Expectations", next_chapter_link="/books/things-i-learnt/data-flow", next_chapter_title="Thinking Data Flow Beats Patterns") }} |
@ -0,0 +1,31 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Resist The Temptation Of Easy" |
||||||
|
date = 2019-07-01 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "ides"] |
||||||
|
+++ |
||||||
|
|
||||||
|
Sure, that IDE will help you with a ton of autocomplete stuff and let you
easily build your project, but do you understand what's going on?
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
I'm not denying the fact that IDEs make things easier. The fact is, you should |
||||||
|
not rely heavily on their features. |
||||||
|
|
||||||
|
I mentioned before that you should at least know how to [run tests on the |
||||||
|
command line](/books/things-i-learnt/tests-in-the-command-line) and the same |
||||||
|
applies to everything in IDEs: how to build, how to run, how to run tests and, |
||||||
|
let's be honest here, how to find proper names for your variables and |
||||||
|
functions. 'Cause, sure, it's nice that the IDE can complete all the names of |
||||||
|
the functions, but if the autocomplete feature was off, would you know which |
||||||
|
function you need? In other words, have you thought for at least 10 seconds
about a good name for your function so you _won't_ need to use autocomplete to
remember its name?
||||||
|
|
||||||
|
These days, IDEs can autocomplete almost everything, from function names to |
||||||
|
even how to name your variables. But using the autocomplete is not always a |
||||||
|
good solution. Finding better names is. |
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/outside-project", prev_chapter_title="Don't Mess With Things Outside Your Project", next_chapter_link="/books/things-i-learnt/use-timezones", next_chapter_title="Always Use Timezones With Your Dates") }} |
@ -0,0 +1,24 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - \"Right Tool For The Job\" Is Just To Push An Agenda"
||||||
|
date = 2019-06-25 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "right tool", "agenda"] |
||||||
|
+++ |
||||||
|
|
||||||
|
A lot of times I heard "We should use the right tool for the job!" Most of |
||||||
|
those times it was just a way to push an agenda. |
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
When someone claims we should use the "right tool", the sentence means there
is a right tool and a wrong tool to do something -- e.g., using a certain
language/framework instead of the current language/framework.
||||||
|
|
||||||
|
But sadly, none of those times was it really the "right tool". Most of the
||||||
|
time, the person saying we should use the "right tool" was trying to push |
||||||
|
their own favourite language/framework, either because they disliked the |
||||||
|
current language/framework or because they don't want to push the "hero |
||||||
|
project". |
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/cargo-cult", prev_chapter_title="Understand And Stay Away From Cargo Cult", next_chapter_link="/books/things-i-learnt/right-tool-obvious", next_chapter_title="The Right Tool Is More Obvious Than You Think") }} |
@ -0,0 +1,29 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - The Right Tool Is More Obvious Than You Think" |
||||||
|
date = 2019-06-25 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "right tool"] |
||||||
|
+++ |
||||||
|
|
||||||
|
Maybe you're in a project that needs to process some text. Maybe you're |
||||||
|
tempted to say "Let's use Perl" 'cause you know that Perl is very strong in |
||||||
|
processing text. |
||||||
|
|
||||||
|
But that may still not be the right tool.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
Although Perl is an amazing tool to process files, providing every single
switch and option you'll ever need, you're missing something: You're working
in a C shop. Everybody knows C, not Perl.
||||||
|
|
||||||
|
Sure, if it is a small, "on the corner" kind of project, it's fine for it to
be in Perl; if it is important for the company, it's better if it is a C
project.
||||||
|
|
||||||
|
One of the reasons your hero project may fail is this: You may even prove
that what you thought was a better solution is actually a better solution,
but it can't be applied 'cause nobody else can maintain it.
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/right-tool-agenda", prev_chapter_title="Right Tool For The Job Is Just To Push An Agenda") }} |
@ -0,0 +1,40 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Always Use Timezones With Your Dates" |
||||||
|
date = 2019-07-01 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "dates", "timezones"] |
||||||
|
+++ |
||||||
|
|
||||||
|
It doesn't matter if the date you're receiving is in your local timezone and
you'll display it in your timezone. Sooner or later, the fact that you ignored
that there was a timezone behind that date will hurt you.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
(Note: In most of this post, when I say "date" you can think of "date and
time", although the date should also be timezone aware.)
||||||
|
|
||||||
|
At some point in my professional life, ignoring timezones was easy: You'd just
pick the date, throw it in the database, then read it back and everybody was
happy.
||||||
|
|
||||||
|
But things are not like this anymore. People will access your site from
faraway locations, the source of the date may not be in the same timezone as
your system, your system may be running in a completely different timezone
from your dev machine (it's pretty common to run things on our machines in the
local timezone while the production system runs in UTC), the display may be in
a completely different timezone than your production and dev machines and so
on.
||||||
|
|
||||||
|
So always carry the timezone with the data. Find modules/classes that support
dates with timezones (a.k.a. make things _timezone aware_), capture the
timezone as soon as possible and carry it around in all operations.
Modules/classes that don't support timezones for dates/times should be removed
from the system as soon as possible.
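
A minimal sketch of "capture the timezone as soon as possible", using only Python's standard `datetime` module. The fixed UTC-3 offset and the `capture`/`display` names are assumptions for the example; a real system would probably use named timezones that know about daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for the timezone the date came from.
SOURCE_TZ = timezone(timedelta(hours=-3))


def capture(naive: datetime, source_tz: timezone) -> datetime:
    # Attach the timezone right at the edge, then normalise to UTC for storage.
    return naive.replace(tzinfo=source_tz).astimezone(timezone.utc)


def display(stored: datetime, viewer_tz: timezone) -> str:
    # Convert only at the output point, for whoever is looking at it.
    return stored.astimezone(viewer_tz).isoformat()


stored = capture(datetime(2019, 7, 1, 21, 30), SOURCE_TZ)

print(stored.isoformat())                             # 2019-07-02T00:30:00+00:00
print(display(stored, SOURCE_TZ))                     # 2019-07-01T21:30:00-03:00
print(display(stored, timezone(timedelta(hours=10)))) # 2019-07-02T10:30:00+10:00
```

Note how the same instant falls on different calendar days depending on who is looking; that is exactly the kind of thing a date without a timezone can't tell you.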
||||||
|
|
||||||
|
Developers a bit more seasoned -- and by "seasoned" I mean "had to deal with
times before" -- will probably claim "Hey, this is _obvious_!" And I'd have to
agree. But it's annoying how many times I got bitten by some stupid bug 'cause
we decided that "well, everything is in the same timezone, so it's all good".
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/resist-easy", prev_chapter_title="Resist The Temptation Of Easy", next_chapter_link="/books/things-i-learnt/utf-utf8", next_chapter_title="Always Use UTF-8 For Your Strings") }} |
@ -0,0 +1,55 @@ |
|||||||
|
+++ |
||||||
|
title = "Things I Learnt The Hard Way - Always Use UTF-8 For Your Strings" |
||||||
|
date = 2019-07-01 |
||||||
|
|
||||||
|
[taxonomies] |
||||||
|
tags = ["en-au", "books", "things i learnt", "utf-8"] |
||||||
|
+++ |
||||||
|
|
||||||
|
Long gone are the days when [ASCII](https://en.wikipedia.org/wiki/ASCII) was
enough for everyone. Long gone are the days when you could deal with strings
with no "weird" or "funny" characters.
||||||
|
|
||||||
|
<!-- more --> |
||||||
|
|
||||||
|
I was born in a time when the only encoding we had was ASCII. You could encode
all strings as sequences of bytes, 'cause all the characters you could use
were encoded from 1 to 255 (well, the printable ones from 32 [space] to 126
[tilde], and you still had a few Latin accented characters in some higher
positions, although not all accents were there).
||||||
|
|
||||||
|
Today, accepting characters beyond that is not the exception, but the norm. To |
||||||
|
cope with all that, we have things like |
||||||
|
[Unicode](https://en.wikipedia.org/wiki/Unicode) and |
||||||
|
[UTF-8](https://en.wikipedia.org/wiki/UTF-8) for encoding that in reasonable
||||||
|
memory space (UTF-16 is also a good option here, but that would depend on your |
||||||
|
language). |
||||||
|
|
||||||
|
So, as much as you want to make your system simple, you will have to keep the
||||||
|
internal representation of your strings in UTF-8/UTF-16. Surely, you may not |
||||||
|
receive the data as UTF-8/UTF-16, but you'll have to encode it and keep |
||||||
|
transmitting it around as UTF-8/UTF-16 till you have to display it, at which |
||||||
|
point you'll convert from UTF-8/UTF-16 to whatever your display supports |
||||||
|
(maybe it even supports displaying in UTF-8/UTF-16, so you're good already). |
||||||
|
|
||||||
|
At this point, I believe most languages do support UTF-8, which is great. You |
||||||
|
may still have problems with inputs coming from other systems that are not |
||||||
|
UTF-8 (old Windows versions, for example), but that's fairly easy to convert |
||||||
|
-- the hard part is figuring out the input _encoding_, though. Also, most |
||||||
|
developers tend to ignore this and only accept ASCII characters, or ignore
UTF-8/whatever-encoding and get a bunch of weird characters in their output,
'cause they completely ignored the conversion at the output point. That's why
||||||
|
I'm repeating the mantra of UTF-8: To remind you to always capture your input, |
||||||
|
encode it in UTF-8 and _then_ convert in the output. |
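
As a sketch of that mantra in Python: decode the input as soon as it arrives, keep it as text inside the system, and encode only when it leaves. The helper names are made up, and I'm assuming the input encoding (Windows-1252 here) is already known, which, as said above, is the hard part.

```python
def to_text(raw: bytes, source_encoding: str) -> str:
    # Input edge: turn whatever bytes arrived into text, once.
    return raw.decode(source_encoding)


def to_utf8(text: str) -> bytes:
    # Output edge: turn the text into UTF-8 bytes for whoever is next.
    return text.encode("utf-8")


# Hypothetical input from an older Windows system.
legacy_bytes = "café".encode("cp1252")

text = to_text(legacy_bytes, "cp1252")
wire = to_utf8(text)

print(legacy_bytes)  # b'caf\xe9'
print(wire)          # b'caf\xc3\xa9'
```

The same four characters are four bytes in one encoding and five in the other, which is why guessing the encoding wrong gives you those "weird" characters.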
||||||
|
|
||||||
|
One thing to keep in mind is that UTF-8 is not a "cost free" encoding like
ASCII: While in ASCII, to move to the 10th character, you'd just jump 10 bytes
from the start of the string, with UTF-8 you can't, due to some characters
being encoded as two or more bytes (you should read the Wikipedia page; the
encoding is pretty simple and makes a lot of sense). Because of this, you
can't simply jump 10 bytes, 'cause you may end up in the second byte of a
multi-byte character. Getting to a given position requires traversing the
string character by character, instead of simply jumping straight to the
proper position. But that's a price worth paying, in the long run.
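
To see that cost in practice, here is a small sketch that works on the raw UTF-8 bytes. The `char_at` helper is made up for illustration (Python's own `str` already handles indexing for you); it relies on the fact that continuation bytes in UTF-8 always look like `10xxxxxx` in binary.

```python
def char_at(data: bytes, index: int) -> str:
    # Walk the bytes, counting only the bytes that start a new character.
    seen = -1
    start = None
    for pos, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:  # not a continuation byte
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return data[start:].decode("utf-8")


text = "ação"
data = text.encode("utf-8")

print(len(text), len(data))  # 4 6
print(char_at(data, 1))      # ç
print(data[2:3])             # b'\xa7' -- the middle of "ç", not a character
```

Jumping straight to byte 2 lands inside the "ç"; walking from the start is the only way to know where each character begins.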
||||||
|
|
||||||
|
{{ chapters(prev_chapter_link="/books/things-i-learnt/use-timezones", prev_chapter_title="Always Use Timezones With Your Dates", next_chapter_link="/books/things-i-learnt/languages-are-more", next_chapter_title="A Language Is Much More Than A Language") }} |