Do. Reflect. Learn. Repeat!

warnings.warn - some DeprecationWarning gotchas

Mon, 01 Feb 2016 11:17:00 CET

`DeprecationWarning`s

It's good practice to deprecate a library's API gradually, so that users get advance warning of coming changes. The built-in way to do so is Python's warnings module:

import warnings

if __name__ == '__main__':
    warnings.warn('deprecation', DeprecationWarning)

By default, they are not reported!

However, if I run this code, there is no output.

$ python3.5 demo.py
$ echo $?
0

The documentation on default warning filters explains:

By default, Python installs several warning filters, which can be overridden by the command-line options passed to -W and calls to filterwarnings().

DeprecationWarning and PendingDeprecationWarning, and ImportWarning are ignored.

BytesWarning is ignored unless the -b option is given once or twice; in this case this warning is either printed (-b) or turned into an exception (-bb).

ResourceWarning is ignored unless Python was built in debug mode.

Forcing defaults from the command line makes them reported

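However, if we force the default warning behavior from the command line, we get the warnings - even though in theory we only specified how often warnings should be reported, not which ones! The reason is that -W d installs a catch-all filter with the default action ahead of the built-in ignore filters, so it matches DeprecationWarning first.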

$ python3.5 -W d demo.py
demo.py:4: DeprecationWarning: deprecation
  warnings.warn('deprecation', DeprecationWarning)
$ echo $?
0
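The same effect can be reproduced from inside a program. The following is a minimal sketch, not part of the original demo: count_reported is a hypothetical helper that installs a single filter for DeprecationWarning and counts how many warnings get through. With the 'default' action (the action -W d applies to every category) the warning is reported; with 'ignore' it is dropped.

```python
import warnings


def count_reported(action):
    """Install one filter for DeprecationWarning and count reported warnings."""
    # record=True collects reported warnings in a list instead of printing them;
    # the previous filters are restored when the block exits.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DeprecationWarning)
        warnings.warn('deprecation', DeprecationWarning)
    return len(caught)


ignored = count_reported('ignore')
reported = count_reported('default')
```

Here ignored and reported differ only because of the filter's action, which is the whole point: the filter decides both whether and how often a warning is shown.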

Before reading the docs, I suspected the operating system maintainers didn't want to 'spam' the end users of their systems with Python library warnings while going about their daily tasks, but apparently this is built into Python itself.

So how should I deprecate things?

I'm a big fan of executable documentation, but this might be one of those cases where good old-fashioned documentation is more effective, as we can't expect users of our library to run with warnings enabled.

I would still leave these deprecations in for the project's developers as well as for the pedantic users of the library.
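For those who do run with warnings enabled, it helps if the warning points at their code rather than at the library. A minimal sketch of one way to do that, assuming a hypothetical deprecated decorator and the made-up functions old_api and new_api (none of these are from the original post):

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and name its replacement (hypothetical helper)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                '{}() is deprecated, use {}() instead'.format(func.__name__, replacement),
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller, not to wrapper()
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def new_api(x):
    return x * 2


@deprecated('new_api')
def old_api(x):
    return new_api(x)
```

The stacklevel=2 argument matters when filtering by module, since the filter is matched against the module the warning is attributed to.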

Checking against deprecations of our dependencies

That's easier, as we own that code. I would run my builds and tests with -W d at the very least, but I would like to try running with -W error. Except that I don't want to fail the build if one of my dependencies uses deprecated APIs, so I would probably have a custom main.py that explicitly sets and resets my warning filters. E.g.: updating the above demo code gives the following:

import warnings
import os

if os.environ.get('TEST', '0') == '1':
    warnings.filterwarnings(module='.*', action='ignore')
    warnings.filterwarnings(module=__name__, action='error')

if __name__ == '__main__':
    warnings.warn('deprecation', DeprecationWarning)

$ TEST=1 python3.5 demo.py
Traceback (most recent call last):
  File "demo.py", line 8, in <module>
    warnings.warn('deprecation', DeprecationWarning)
DeprecationWarning: deprecation
$ echo $?
1

I have yet to test how feasible this is when our library supports multiple versions of a dependency, e.g.: Django, but my gut feeling is that it should be doable.

Beware of the ordering of warning filters

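If we reversed the order in the above example, our own DeprecationWarnings would be ignored too! Each filterwarnings() call inserts its filter at the front of the filter list by default, so the most recently added matching filter wins: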

if os.environ.get('TEST', '0') == '1':
    warnings.filterwarnings(module=__name__, action='error')
    warnings.filterwarnings(module='.*', action='ignore')
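The ordering effect is easy to check in isolation. This is a small sketch under simplifying assumptions: outcome is a hypothetical helper, and it filters by category instead of by module to keep the example short. It applies two filters in the given order and reports what happens to a DeprecationWarning.

```python
import warnings


def outcome(first, second):
    """Apply two filters in order and report the fate of a DeprecationWarning."""
    # catch_warnings restores the original filters when the block exits.
    with warnings.catch_warnings(record=True):
        warnings.filterwarnings(first, category=DeprecationWarning)
        # Inserted at the front of the filter list, so it is consulted first.
        warnings.filterwarnings(second, category=DeprecationWarning)
        try:
            warnings.warn('deprecation', DeprecationWarning)
        except DeprecationWarning:
            return 'raised'
    return 'ignored'


ignore_then_error = outcome('ignore', 'error')
error_then_ignore = outcome('error', 'ignore')
```

If front insertion is not what you want, filterwarnings() also accepts append=True, which adds the filter at the end of the list instead.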

The post warnings.warn - some DeprecationWarning gotchas first appeared on http://blog.zsoldosp.eu.

A few thoughts on BDDon't

Wed, 09 Dec 2015 23:00:00 CET

Note: apparently I wrote this blog post back in December, but forgot to publish it. I am publishing it now, together with the Python warnings post, and keeping the original date from last year.

Kevin Dishman wrote an article titled BDDon't, in which he raises a few objections to common problems with the tools and the practices around them. As we recently discussed Gherkin-style testing at the Softwerkskammer Nuremberg user group after my polytesting talk, I'm now doubly prompted to blog some thoughts.

Context/Disclaimer
I agree with Kevin's problem statements
Despite that, I still like Gherkin tools
Reduce the pain for Gherkin
- Regular-expression mapping - Parameter variance
- Global state

Context/Disclaimer

I've only used behave, so it's entirely possible that it behaves differently from the tool(s) Kevin used.
I don't run the behave tests at the Selenium level.
I have never worked for an airline, though I have flown often enough. To be consistent with Kevin's examples, I will use this domain to illustrate my points.
I don't use BDD tests exclusively.

I agree with Kevin's problem statements

Regular-expression mapping can become complex and error prone and should be avoided
Global state and step interdependencies are also error prone and should be avoided
If the business is really interested in participating in test specification, this would be a great opportunity for them to pair with someone familiar and comfortable with the test suites the team has created.

Despite that, I still like Gherkin tools

It forces separation of test, test glue, and app code

Writing readable tests is hard. While on a team with a great culture and strong discipline this isn't a challenge, on teams that are not yet experts in testing I find this constraint really helpful.

Gherkin docs can be useful beyond dev

As stated above, I've found that motivated stakeholders, BAs, etc. will be fine reading Java, Ruby, or Python test code. However, other people can get value from existing - and filterable/navigable (!) - Gherkin documents, e.g.: tech/customer support people could appreciate living documents. A new support person would need the same info as a new developer making changes in that area! Or imagine that your processes need to be audited - persuading an auditor to read natural-language text might be easier than getting them to navigate codebases.

Don't have to test things on the full stack level

Even if Gherkin is used for acceptance testing, some algorithms/calculations can be tested on the unit/component level, never even touching the UI to provide the confidence the developers (business) need(s). As time goes on, it's even possible to transition tests up and down the test pyramid (thanks to Stefan Clepce for bringing behave stages to my attention)!

Reduce the pain for Gherkin

Regular-expression mapping - Parameter variance

Different abstraction levels

The problem of complex test data isn't unique to BDD tools - integrated tests written in any tool will face the same problem. And the solutions are similar - in the tests where I care about the concrete values, I specify all the test values explicitly, and in tests where I don't care about those details and operate at a higher level of abstraction, I create sample (random) data for that - à la pytest fixtures.

E.g.: instead of

Given I have booked 2 flights

I might say

Given I booked a valid flight
And I booked another valid flight

Both of these steps can map to the same step implementation - whether via regex, multiple methods calling the same actual implementation, or some other way. Here I quite like behave's annotations, i.e.: I can say

@given('I booked a valid flight')
@given('I booked another valid flight')
def booked_a_valid_flight(context):
    ....

Or I might even test things in isolation, not even sharing step implementations - I might test that the OLTP model gets translated to the correct reporting model, and for the report I only test things on the report data model (see step tables below).

Relying on the application's own defaults

I have never booked a flight for an unaccompanied minor. Until reading Kevin's article, it hadn't even occurred to me that this is a use case to consider, yet I've booked quite a number of flights.

I'm sure there is an option to specify that. However, that option is not checked by default in the application's checkout - the customer has to request it explicitly to be able to specify such a scenario.

Thus for such a scenario, I likely would include an explicit step for booking with an unaccompanied minor.

Using step tables (beware, this is defaults in disguise!)

Behave supports step table data:

Given I have booked a flight
   | type                     | return |
   | with unaccompanied minor | true   |

Technically, the above would need a header, but the step implementation could treat the header as the first row, if that aids readability.

This can cause more complex test code, but if each step corresponds to an actual action in the application (à la Page Object), then there should be no extra logic in the step handler - each line here corresponds to a form input (or HTTP POST) entry.
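Treating the header as the first row could look roughly like this. It is a sketch, not behave code: table_as_pairs is a hypothetical helper that takes plain lists, on the assumption that a step implementation would pass in something like context.table.headings and the cells of each row of context.table.

```python
def table_as_pairs(headings, rows):
    """Treat the heading row as data and return (field, value) pairs."""
    # Put the heading row back in front of the data rows.
    all_rows = [list(headings)] + [list(cells) for cells in rows]
    # Each two-column row is one form field and its value.
    return [(field.strip(), value.strip()) for field, value in all_rows]


# The table from the step above, written out as plain lists.
form_entries = table_as_pairs(
    ['type', 'return'],
    [['with unaccompanied minor', 'true']],
)
```

Each resulting pair can then be fed to the form (or HTTP POST) one entry at a time, with no per-scenario logic in the step handler.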

Global state

In behave, each step implementation method receives a Context object to store its state (changes) in. From the behave documentation:

This object is a place to store information related to the tests you're running. You may add arbitrary attributes to it of whatever value you need.

During the running of your tests the object will have additional layers of namespace added and removed automatically. There is a "root" namespace and additional namespaces for features and scenarios.

This differs from how it is implemented in, e.g., cucumber-jvm, and maybe that's why I haven't run into this issue (though external global state, such as a database, can still be shared).
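The layering idea can be illustrated with the standard library. This is only an analogy, assuming nothing about how behave implements its Context: collections.ChainMap stands in for the stacked namespaces, and all keys and values are made up.

```python
from collections import ChainMap

# Root layer: lives for the whole test run.
root = ChainMap({'base_url': 'http://localhost:8000'})

# A feature adds its own layer on top of the root.
feature = root.new_child({'airline': 'Example Air'})

# Each scenario gets a fresh layer; writes only land in that top layer.
scenario = feature.new_child()
scenario['booking_reference'] = 'ABC123'

# Inside the scenario, values from all layers are visible.
visible_in_scenario = dict(scenario)

# When the scenario ends, its layer is dropped and its state goes with it.
after_scenario = scenario.parents
```

State set during one scenario therefore never leaks into the next one, while feature-level and root-level values stay available.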


Quick script to help reporting bugs for python

Wed, 11 Sep 2013 17:00:00 CEST

While poking around the testrepository package, I ran into the cryptic error message 'unicodeescape' codec can't decode bytes in position 56-57: truncated \uXXXX escape. I set out to reproduce the bug, but that is of course an iterative process, like anything else in coding, so I decided to script it. Since I expect I'll need this again, and someone else might need it too, I'm recording it here.

Note

I eventually figured out that the problem was that the recommended default for testrepository has a different command line behavior from the built-in unittest runner:

python -m unittest discover bugrepro

With testr run bugrepro, the argument doesn't get translated to the discover root, but ends up in the LISTOPT variable (python -m subunit.run discover . $LISTOPT $IDOPTION)

Still, a clearer exception message would have been nice.

My Environment

While I use Linux VMs for serious development, for explorations/hobbies I use the base Windows 7 install from the command line in git-bash - it's enough for basic scripting, plus I tend to use git anyway, and I don't like PowerShell.

The script

#!/bin/bash
# Echo a command, then run it; "$@" keeps each argument as a separate word.
function d() {
    echo "\$ $*"
    "$@"
}

function win_info() {
    systeminfo | grep "\(OS Name\|OS Manufacturer\|System Type\|Locale\)"
}

REPRO_FOLDER=bugrepro
d win_info
d python --version
d pip freeze
d git --version
d grep ^ -nH `find $REPRO_FOLDER -name \*.py`
d python -m unittest discover $REPRO_FOLDER
d ls .testr* -l
d cat .testr.conf
d testr run $REPRO_FOLDER
d testr run

Running ./bugrepro.sh 2>&1 | tee bugrepro.txt > /dev/null produces the following output (cropped, you can see the full output here):

$ win_info
OS Name:                   Microsoft Windows 7 Professional 
OS Manufacturer:           Microsoft Corporation
System Type:               x64-based PC
System Locale:             en-us;English (United States)
Input Locale:              en-us;English (United States)
$ python --version
Python 2.7.4
$ pip freeze
extras==0.0.3

Things I learned

This took somewhat longer than expected (and writing this post wasn't even planned!) and I haven't even reported the actual bug yet (yak shaving...), but I don't mind - especially because I did all this while recovering from a nasty cold :)

for cmd.exe, the ver and systeminfo commands are pretty neat, and there are more: type help to list them
wrote my first blog post in reStructuredText, since it's a better fit for including snippets (executable documentation is a hobby horse of mine!)

Open Questions (aka: do I want to shave further yaks?!):

cmd.exe /C doesn't seem to behave as one would expect when invoked from git-bash (msysgit, 1.8.1) - it doesn't exit, and execution only continues after an exit command!
I always want to do metaprogramming in bash - how could I display the body of a bash function? I'm thinking of something similar to what one does with alias:
```
$ alias foo='echo foo'
$ foo
foo
$ alias foo
alias foo='echo foo'
```
is there a better way of passing arguments in bash? I ended up doing grep ^ because I went crazy trying to escape find... -exec...\;. Making the script use #!/bin/bash -x would be overkill here, as I just want to echo back the command that was executed...


Own your data (or why did I move away from Blogger and WordPress?)

Sun, 04 Aug 2013 14:25:00 CEST

This blog used to be two separate blogs, hosted at Blogger and WordPress.com, respectively. I've gone to some trouble to migrate their content to this Blogofile-based setup, hopefully without breaking URLs. In the process, I have lost a considerable number of features and conveniences - so why did I do this?

Owning my data and platform

As the saying goes, there is no such thing as a free lunch (or put more bluntly, if you are not paying for it, you are the product being sold). A prime example is WordPress.com, which reserves the right to display ads on your freely hosted blog, while Blogger probably enhances their advertisement profile of you - I don't know.

Even if these platforms don't do anything bad at the moment, they can pretty much change the features available to you, or any other aspects of their terms of service - remember that story about the Instagram ToS change regarding commercial use of your photos? Sure, it turned out to be a misunderstanding and/or they backed down, but theoretically they can do it.

Of course, this wasn't a concern when I started out with blogging, but certainly is something to consider now that I am nearing post #42.

These considerations are of course applicable for other service providers beyond blogging, e.g.:

(sports) tracking applications - I have not yet gotten around to building a website like Suzi & Ralf's, but I know that eventually I will want to create something with the trails of all the places I've been to, whether for fun, for an anniversary gift, or similar - and most sites make it rather hard to export your data conveniently (endomondo is particularly annoying, so I'm really grateful for the easy zip-export of runkeeper!)
social networks - do LinkedIn, Facebook, Xing, etc. allow you to easily export your contacts and their contact details? I would be pretty upset to find myself without a personal copy of that data

Of course, I keep using services hosted by others, but I try to make sure I use one with a friendly data liberation policy!

Other considerations

backup - sure, it's almost a repeat of the prior point, but worth noting. It doesn't happen as often as before, but there is always the possibility of data loss or service outage
version control - being a software developer, this is almost second nature to me - it's incredibly liberating to be able to throw all my changes away and go back to a previous, known good version of a post draft.
offline authoring - I do a lot of my writing and hobby coding during my train commute, with spotty internet connection at best. Working locally on my laptop with my favorite text editor beats any online editor widget.
full customization - sure, there is probably a WordPress plugin for anything I would want to do, but for a lot of the small checks, it takes longer to find, learn, and configure the one I need than to implement it in Python - e.g.: checking the site for broken links, custom reports, etc. I should probably mention the HTML template customizations here too, though you might be able to tell that this is not yet the highest priority for me :)


Book Review - Python Testing Cookbook by Greg L. Turnquist

Sun, 23 Oct 2011 13:24:00 CEST

I have been doing (developer) automated testing for years now, but I recently moved from .NET to Python. At one point I suggested to colleagues that we try Concordion, only to learn that there is the doctest module that could be used to achieve a similar result (more about that in a later post). Remembering my own advice, When In Rome, Do as the Romans Do, I set out looking for a Python-specific book about testing - and the Python Testing Cookbook by Greg L. Turnquist seemed to be a good fit based on reviews.

Overall, I liked the book, and it lived up to my expectations - it provided me with a list of tools and some sample code to get started with each of them.

Beware that it is an entry-level book - if, like me, you are already familiar with the testing concepts and are looking for a book on advanced testing concepts and theories, this book might be too little for you (or just read through the "There is more" sections of the recipes). But it is great for someone new to testing - though discussions with (and code reviews by) someone experienced in testing should accompany the reading.

The criticisms below are meant as a companion to the book rather than advice against it (probably the only book in which I would find nothing missing and nothing to criticise would be one written for me, in real time, based on my immediate needs). The fact that the list is short shows how valuable I found the rest of the book, with great advice that goes beyond the cookbook format (why you shouldn't use fail(), why there should be one assert per test, etc.). I don't see eye to eye with the book on every topic, but just as the book is not written in a "my way or the highway" style, I will not get into minor differences of opinion.

Format of the book

Each chapter explores a topic, with multiple specific recipes. Each recipe is relatively self-contained, so if we are in a hurry and need to get to the solution of one specific problem without reading the whole book/chapter, it's possible. However, for those reading whole chapters, this results in a bit of repetition - I had to fight the urge to just gloss over the code I (thought I) had seen before.

Each recipe follows the format of

stating the problem

showing code that solves it

explaining how the code works

and finally, providing warnings about pitfalls and problems in the code, and some further advice

While this format is easy to follow, it has a few drawbacks.

until I got used to this style, I often found myself cursing out loud like the code reviewers in this comic while reading the code that will later be explained to be bad/antipattern.

each recipe has a lot of additional testing insight, potentially unrelated to the tool being demonstrated - but one can miss these, thinking "oh, I know all about doctest, I'll just skip that chapter"

for people in a hurry, just scanning the cookbook and copying (typing up) the code - there is nothing in the code to indicate that it contains an antipattern, only in the later paragraphs - which you might not read when in a hurry. Just thinking about the examples where the unit tests have no asserts but only print statements gives me the shivers (and it's even used for illustration in the chapter about Continuous Integration!).

What was missing from the book

About testing legacy code, I was missing two things:
- a pointer to Michael Feathers' classic book, Working Effectively with Legacy Code
- a warning about a mistake I have seen almost everyone (myself included) make when getting started with testing legacy code: don't write tests just because you can - only add cases for the area you are working on and only add enough to cover your current needs. This is hinted at, but I've found it's important to state it explicitly.

Notes about test maintainability
- I strongly disagree with the approach of having one test class per physical class, with test methods directly exercising the class's methods. I've found this can lead to maintainability problems down the road, so I prefer to introduce helper abstractions (e.g.: an assert_roman_numeral_conversion(expected_result, roman_numeral_string) method) for most of my tests, and to organize test methods by logical scenarios instead of mirroring code organizational units (on successful login, validating user input, etc.). These abstractions (indirections) make it easier to update tests after an API change or refactoring. It might sound like an advanced topic, but I think it's a key concept for effective tests, and one that people should be exposed to early (just after they've made the mental jump from throwaway main methods with visual/human assertions to automated tests with automated assertions).
- Acceptance Testing - it is notoriously difficult for us programmers to write good acceptance tests that are both maintainable and readable by the customers. I'm rather sure that in the example given in the book, the customers would not be interested in knowing which html tag contains the price in the output.
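To illustrate the helper abstraction mentioned above, here is a minimal sketch using unittest. It is not taken from the book: roman_to_int is a hypothetical stand-in for the code under test, and the scenario names are made up. The point is that only assert_roman_numeral_conversion knows how the conversion is invoked, so an API change touches one method instead of every test.

```python
import unittest

ROMAN_VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}


def roman_to_int(roman_numeral_string):
    """Hypothetical code under test: convert a Roman numeral to an int."""
    total = 0
    previous = 0
    # Walk right to left; a symbol smaller than the largest seen so far is subtracted.
    for symbol in reversed(roman_numeral_string):
        value = ROMAN_VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


class ConvertingValidNumerals(unittest.TestCase):
    def assert_roman_numeral_conversion(self, expected_result, roman_numeral_string):
        # The only place that knows how the conversion API is called.
        self.assertEqual(expected_result, roman_to_int(roman_numeral_string))

    def test_additive_notation(self):
        self.assert_roman_numeral_conversion(3, 'III')
        self.assert_roman_numeral_conversion(16, 'XVI')

    def test_subtractive_notation(self):
        self.assert_roman_numeral_conversion(4, 'IV')
        self.assert_roman_numeral_conversion(1994, 'MCMXCIV')
```

Saved as a module, this can be run with python -m unittest. If the conversion later moves into a class or gains a parameter, only the helper method needs to change.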

Minor criticisms

there is an inconsistent level of detail and further pointers. E.g.: while optparse is explained in detail, virtualenv and setuptools are glossed over.

In addition to the assertless test methods, the other thing that shocked me was the example in the doctest chapter that - to illustrate that the test works - introduced a bug in the test code. The fact that test code is code, and thus can be buggy, should be emphasized, but that wasn't the point being made here. This could leave the reader wondering why exactly we introduced the bug in the test code - aren't we testing the application?

The book is careful not to fall into the trap of elitist speak that might alienate people, but saying that coupling and cohesiveness are subjective terms is just providing gunpowder to unwinnable arguments(*).

Interesting notes

This might be a cultural thing (I'm coming from .NET), but I found it rather surprising that such an entry-level book talks about test runners and about writing custom test runners. It's useful knowledge, just something that I have not seen mentioned in so much detail so early in the Java/.NET world. Maybe it's got to do with IDEs being more widespread there, where running a subset of the tests is easy.

As said, the book lives up to expectations, so if you would like a quick and practical introduction to testing in Python - both tools and concepts - I can recommend this one.

(*) Reminds me of a story from long ago. The team in question had decided to use bind variables for all SQL invocations (as I said, this was some time ago) to prevent SQL injection. One programmer wrote a stored procedure that concatenated the SQL command in its body... and argued that this was just a matter of style. At least the procedure was invoked from the application using bind parameters...
