Piccolo https://piccolo-orm.com/ Articles about the Piccolo ORM and Python development. Tue, 08 Oct 2024 22:25:25 GMT https://validator.w3.org/feed/docs/rss2.html Gridsome Feed Plugin <![CDATA[Piccolo Admin forms - downloading files!]]> https://piccolo-orm.com/blog/piccolo-admin-forms-downloading-files/ https://piccolo-orm.com/blog/piccolo-admin-forms-downloading-files/ Tue, 08 Oct 2024 00:00:00 GMT [Piccolo Admin](https://piccolo-admin.readthedocs.io/en/latest/) lets you easily add [custom forms](https://piccolo-admin.readthedocs.io/en/latest/custom_forms/index.html) to the UI - all you need to do is provide a Pydantic model and an endpoint. Up until recently these forms could just return a string, which is shown to the user when the form is submitted. You can now return files instead - see the [docs](https://piccolo-admin.readthedocs.io/en/latest/custom_forms/index.html#fileresponse). This is really useful, especially for reporting purposes. For example, if your data science team needs a CSV report, we can build a custom form for them, so they can download the report whenever they want. ]]> <![CDATA[SELECT FOR UPDATE in Piccolo / Postgres]]> https://piccolo-orm.com/blog/select-for-update-in-piccolo-postgres/ https://piccolo-orm.com/blog/select-for-update-in-piccolo-postgres/ Fri, 27 Sep 2024 00:00:00 GMT In our latest video we explore how `SELECT FOR UPDATE` works in Postgres, and how to use it in Piccolo. We also demonstrate how it prevents vulnerabilities like the [ACIDRain](https://dl.acm.org/doi/10.1145/3035918.3064037) attack. To learn more about how to use `SELECT FOR UPDATE`, see the [Piccolo docs](https://piccolo-orm.readthedocs.io/en/latest/piccolo/query_clauses/lock_rows.html), and the [Postgres docs](https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE). The source code used in the video is available [here](https://github.com/piccolo-orm/piccolo_videos/blob/main/select_for_update/main.py). 
]]> <![CDATA[Piccolo Admin - Multi-factor Authentication now available!]]> https://piccolo-orm.com/blog/piccolo-admin-multi-factor-authentication-now-available/ https://piccolo-orm.com/blog/piccolo-admin-multi-factor-authentication-now-available/ Tue, 10 Sep 2024 00:00:00 GMT We've been working on adding Multi-factor Authentication (MFA) support to Piccolo Admin for several months now, and it's finally here! Read more in the [Piccolo Admin docs](https://piccolo-admin.readthedocs.io/en/latest/mfa/index.html). Look out for a future article discussing some of the design decisions behind the MFA implementation. ]]> <![CDATA[Testing Python Type Annotations]]> https://piccolo-orm.com/blog/testing-python-type-annotations/ https://piccolo-orm.com/blog/testing-python-type-annotations/ Sat, 07 Jan 2023 00:00:00 GMT >> await Band.objects() # list[Band] ``` Wouldn't it be great if we could write a test, to make sure the type annotations don't break? Just as we write unit tests, we can do something similar for our type annotations. We do this using ``assert_type``. Here's an example using mypy: ```python # main.py # For Python 3.11 and above: from typing import assert_type # Otherwise `pip install typing_extensions`, and use the following: from typing_extensions import assert_type # The function needs type annotations otherwise mypy will ignore it: async def test() -> None: # This will pass: assert_type(await Band.objects(), list[Band]) # This will fail: assert_type(await Band.objects(), str) ``` ``mypy`` will show an error if the type assertion fails: ``` >>> mypy main.py main.py: error: Expression is of type "list[Band]", not "str" [assert-type] ``` Check out this [type checking file in Piccolo](https://github.com/piccolo-orm/piccolo/blob/fdb703f4abf461dc323776d9f2611a1dc92a6c92/tests/type_checking.py) - we use it to make sure that all sorts of queries have the correct type annotations. We run the tests as part of our CI pipeline, which lets us know if something breaks. 
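One thing worth knowing: at runtime, ``assert_type`` is a no-op - it simply returns its first argument unchanged, and the actual checking only happens when a type checker analyses the code. A quick sketch (assuming Python 3.11+, or ``typing_extensions`` installed on older versions):

```python
try:
    from typing import assert_type  # Python 3.11+
except ImportError:
    from typing_extensions import assert_type  # pip install typing_extensions

values = [1, 2, 3]

# At runtime this just returns `values` - no exception is raised,
# even if the asserted type is wrong. Only mypy / Pyright flag it.
result = assert_type(values, list)

print(result)  # [1, 2, 3]
```

This is why the tests are safe to import and run as part of a normal test suite - they only "fail" when the type checker runs over them.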
Not every project will require these kinds of tests, but for libraries, and certain apps, they can be incredibly useful. ## Related To learn more about ``TypeVar``, check out our [article about it](../advanced-type-annotations-using-python-s-type-var/). ]]> <![CDATA[Advanced type annotations using Python's TypeVar]]> https://piccolo-orm.com/blog/advanced-type-annotations-using-python-s-type-var/ https://piccolo-orm.com/blog/advanced-type-annotations-using-python-s-type-var/ Sat, 07 Jan 2023 00:00:00 GMT str: return f'Hello {name}' ``` We pass in a string, and return a string - nice and easy. ## Advanced annotations using ``TypeVar`` There are some situations where we have to get more creative with our type annotations. Consider the function below, which doubles the number we pass into it: ```python def double(value: int | float | decimal.Decimal): return value * 2 ``` Several value types are allowed (``int``, ``float`` and ``Decimal``). We could add the following return type: ```python def double( value: int | float | decimal.Decimal ) -> int | float | decimal.Decimal: return value * 2 ``` But when you think about it, it doesn't really make sense. When we pass in an ``int``, we should get an ``int`` returned. What this type annotation is saying is that when we pass in an ``int``, then we could get back an `int`, `float` or `Decimal`. This is where `TypeVar` comes in. It allows us to do this: ```python import decimal from typing import TypeVar Number = TypeVar("Number", int, float, decimal.Decimal) def double(value: Number) -> Number: return value * 2 ``` This tells static analysis tools like [``mypy``](https://mypy.readthedocs.io/en/stable/) and [``Pylance``](https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance) that the type returned by the function is the same as the type which was passed in. 
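At runtime the ``TypeVar`` has no effect - the function behaves exactly as before, and each supported type naturally round-trips. Here's the same ``double`` function as a self-contained, runnable sketch:

```python
import decimal
from typing import TypeVar

# The TypeVar is constrained to these three types:
Number = TypeVar("Number", int, float, decimal.Decimal)

def double(value: Number) -> Number:
    return value * 2

# Each call returns the same type it was given:
print(double(2))                       # 4
print(double(1.5))                     # 3.0
print(double(decimal.Decimal("2.5")))  # 5.0
```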
It also tells the type checker that values other than ``int``, ``float`` and ``Decimal`` aren't allowed: ```python double("hello") # error ``` Piccolo uses ``TypeVar`` extensively - without it, it would be impossible to provide correct types for certain functions. Give it a go! ]]> <![CDATA[CockroachDB support]]> https://piccolo-orm.com/blog/cockroach-db-support/ https://piccolo-orm.com/blog/cockroach-db-support/ Mon, 17 Oct 2022 00:00:00 GMT <![CDATA[Taking our UI testing to the next level with Cypress]]> https://piccolo-orm.com/blog/taking-our-ui-testing-to-the-next-level-with-cypress/ https://piccolo-orm.com/blog/taking-our-ui-testing-to-the-next-level-with-cypress/ Sun, 04 Sep 2022 00:00:00 GMT Cypress results
Cypress test results
Testing is an integral part of software development, and arguably even more so with open source projects. Piccolo and its related projects have extensive test suites, mostly consisting of unit tests for the backend Python code. One of the most important parts of the Piccolo ecosystem is [Piccolo Admin](https://github.com/piccolo-orm/piccolo_admin/), a powerful admin interface / content management system. It contains a lot of UI code (written with Vue.js), and builds upon the rest of the Piccolo ecosystem ([`piccolo_admin`](https://github.com/piccolo-orm/piccolo_admin/) is built on [`piccolo_api`](https://github.com/piccolo-orm/piccolo_api/), which is built on [`piccolo`](https://github.com/piccolo-orm/piccolo/)). By running integration / UI tests on Piccolo Admin we can make sure that the Piccolo ecosystem of libraries is working together as expected. ## Why Cypress? [Cypress](https://github.com/cypress-io/cypress) has become very popular. It's developer friendly, and productive. It's similar to tools like [Selenium](https://en.wikipedia.org/wiki/Selenium_(software)). You write tests in Javascript, and the tests run within a web browser (typically headless Chrome). The tests are simple to write - in the example below, we type some content into a form, and then submit it: ```javascript // Fill the username cy.get('[name="username"]') .type('piccolo') .should('have.value', 'piccolo'); // Fill the password cy.get('[name="password"]') .type('piccolo123') .should('have.value', 'piccolo123'); // Locate and submit the form cy.get('form') .submit(); // Make sure the correct page was rendered cy.location('pathname', { timeout: 5000 }) .should('eq', '/'); ``` You can see some [full examples in our Git repo](https://github.com/piccolo-orm/piccolo_admin/tree/master/admin_ui/cypress/integration). By writing these automated tests, we increase our confidence in the code base, and it reduces the amount of manual testing we have to do. 
## GitHub Actions Having Cypress tests is great, but to get the maximum value we need to run them as part of our CI pipeline. Luckily there's an official tool for doing this: [cypress-io/github-action](https://github.com/cypress-io/github-action). You can see our [full YAML config here](https://github.com/piccolo-orm/piccolo_admin/blob/master/.github/workflows/cypress.yaml), and an [example of a successful pipeline run](https://github.com/piccolo-orm/piccolo_admin/actions/runs/2989278617). Once the pipeline has run, we can see how many tests passed / failed:
Cypress results on GitHub Actions
Cypress results on GitHub Actions
Even though the tests are running in headless Chrome, Cypress can generate screenshots and videos of the tests, so we save these as artifacts:
Cypress artifacts on GitHub Actions
Cypress artifacts on GitHub Actions
The artifacts are stored in a zip file, which we can download and inspect. When I started using Cypress, the screenshots and videos were something which really impressed me. Here we can see a test in action, navigating around the app and submitting some forms. It's like having our own personal android! ## Moving forward Now that we have some initial Cypress tests, integrated with our CI, where do we go next? There are lots more Cypress tests left to write, and we want to get to a position where every new feature is accompanied by a set of Cypress tests. If you want to [get involved](https://github.com/piccolo-orm/piccolo_admin/), and learn more about Cypress, then you're welcome to join us. Cypress is surprisingly fun, and a valuable skill to learn. ## Update - now using Playwright! We ended up migrating to [Playwright](https://playwright.dev/), which is a similar framework, but the tests can be written in Python. Being able to write the tests in Python is a huge boon for us, as we can test the UI (for example submitting a form), and then use Piccolo to query the database to make sure the data was modified. I'm still a fan of Cypress, but Playwright is the obvious choice for Python developers. ]]>
<![CDATA[Piccolo Admin is now multilingual]]> https://piccolo-orm.com/blog/piccolo-admin-is-now-multilingual/ https://piccolo-orm.com/blog/piccolo-admin-is-now-multilingual/ Thu, 21 Jul 2022 00:00:00 GMT After a surprisingly large amount of work, Piccolo Admin has multilingual support!
Piccolo Admin multilingual support - Portuguese
Piccolo Admin, in Portuguese
Piccolo Admin detects your preferred language based on your browser settings, and will try to translate the UI accordingly, if we have a matching translation available. You can also manually select your preferred language, using the new button in the nav bar, shown in the image above. The languages we support for now are: * Croatian * French * German * Portuguese * Spanish * Welsh Some of these were provided by native speakers of the language (Croatian, Portuguese), and the rest are machine translated. If you find any errors in our translations, or would like to contribute translations for another language, here's a [guide on how to contribute](https://piccolo-admin.readthedocs.io/en/latest/contributing/index.html#translations). It should only take a few minutes, and is very much appreciated! ]]>
<![CDATA[Piccolo Admin - bulk updates, and more!]]> https://piccolo-orm.com/blog/piccolo-admin-bulk-updates-and-more/ https://piccolo-orm.com/blog/piccolo-admin-bulk-updates-and-more/ Fri, 08 Jul 2022 00:00:00 GMT Piccolo Admin bulk update screenshot
Piccolo Admin, bulk update
You can learn more in the video below: ]]>
<![CDATA[A guide to managed PostgreSQL services]]> https://piccolo-orm.com/blog/a-guide-to-managed-postgre-sql-services/ https://piccolo-orm.com/blog/a-guide-to-managed-postgre-sql-services/ Mon, 23 May 2022 00:00:00 GMT <![CDATA[Creating a new Sphinx theme for our docs]]> https://piccolo-orm.com/blog/creating-a-new-sphinx-theme-for-our-docs/ https://piccolo-orm.com/blog/creating-a-new-sphinx-theme-for-our-docs/ Mon, 21 Feb 2022 00:00:00 GMT <![CDATA[The power of Python descriptors]]> https://piccolo-orm.com/blog/the-power-of-python-descriptors/ https://piccolo-orm.com/blog/the-power-of-python-descriptors/ Mon, 24 Jan 2022 00:00:00 GMT >> I was accessed parent.child = 1 >>> I was a assigned a new value ``` There are lots of interesting use cases. When a value is assigned we could: - Store it in an external database. - Invalidate a cache. - Refresh some UI (it's not too dissimilar to how reactivity is handled in Vue JS). When a value is read we could: - Calculate the value dynamically. - Fetch the value from an external source. - Log the value. ## Context What makes the descriptor protocol extra interesting is the `obj` argument which is provided to the `__get__` and `__set__` methods. The `obj` argument is either `None` or a class instance. - When `obj` is `None`, then the the attribute was accessed on a class (i.e. `Parent.child`). - When `obj` is a class instance, the attribute was accessed on that instance (i.e. `Parent().child`). We're able to customise the behaviour depending on where it was called from. A trivial example: ```python class Child: def __get__(self, obj, objtype=None): if obj is None: print("I was accessed from a class.") else: print("I was accessed from a class instance.") ``` In an ORM like Piccolo, having this information is incredibly value. 
In the example below, the `name` attribute represents the column type: ```python class Band(Table): name = Varchar() ``` But when we do a database query, the name attribute returns the value in the database instead. ```python band: Band = await Band.objects().first() >>> band.name 'Pythonistas' >>> type(band.name) str ``` Being able to have correct type annotations was a huge head scratcher - how do you have correct type annotations for an attribute which is context dependent? It turns out we can do this using descriptors: ```python class Varchar(Column): ... @typing.overload def __get__(self, obj: Table, objtype=None) -> str: ... @typing.overload def __get__(self, obj: None, objtype=None) -> Varchar: ... def __get__(self, obj, objtype=None): # This is Piccolo specific: return obj.__dict__[self._meta.name] if obj else self ``` [MyPy](https://mypy.readthedocs.io/en/stable/) now knows when the `name` is a `Varchar`, and when it's a `str`. ## Conclusions This just scratches the surface of descriptors. As mentioned in the intro, they're not needed every day, but they help us solve really tricky problems, and unlock some interesting design space for Python libraries. ## Resources - [An official guide on python.org](https://docs.python.org/3/howto/descriptor.html) ]]> <![CDATA[Managing your data using FastAPI and Piccolo Admin]]> https://piccolo-orm.com/blog/managing-your-data-using-fast-api-and-piccolo-admin/ https://piccolo-orm.com/blog/managing-your-data-using-fast-api-and-piccolo-admin/ Sat, 08 Jan 2022 00:00:00 GMT We give an overview of [Piccolo](https://github.com/piccolo-orm/piccolo) and [FastAPI](https://piccolo-api.readthedocs.io/en/latest/fastapi/index.html), and show how you can create a movie database, with an API and admin interface in record time. The [slides are available to download](https://github.com/piccolo-orm/pymdb/raw/master/pydata_global_presentation.pdf). The [code used in the demo is also available](https://github.com/piccolo-orm/pymdb/). 
]]> <![CDATA[Many-to-Many relationships]]> https://piccolo-orm.com/blog/many-to-many-relationships/ https://piccolo-orm.com/blog/many-to-many-relationships/ Mon, 20 Dec 2021 00:00:00 GMT Piccolo has a new API for [Many-To-Many relationships](https://piccolo-orm.readthedocs.io/en/latest/piccolo/schema/m2m.html). We put a lot of work into making it powerful and user friendly. Take this schema as an example, where you have bands, and they belong to musical genres: ```python from piccolo.columns.column_types import ( ForeignKey, LazyTableReference, Varchar ) from piccolo.columns.m2m import M2M from piccolo.table import Table class Band(Table): name = Varchar() genres = M2M(LazyTableReference("GenreToBand", module_path=__name__)) class Genre(Table): name = Varchar() bands = M2M(LazyTableReference("GenreToBand", module_path=__name__)) # This is our joining table: class GenreToBand(Table): band = ForeignKey(Band) genre = ForeignKey(Genre) ``` We can do all kinds of awesome queries: ```python >>> await Band.select(Band.name, Band.genres(Genre.name, as_list=True)) [ { "name": "Pythonistas", "genres": ["Rock", "Folk"] }, ... ] ``` To get the results as dictionaries: ```python >>> await Band.select(Band.name, Band.genres(Genre.id, Genre.name)) [ { "name": "Pythonistas", "genres": [ {"id": 1, "name": "Rock"}, {"id": 2, "name": "Folk"} ] }, ... ] ``` We can also use it in reverse, to get all bands which belong to a given genre. ```python >>> await Genre.select(Genre.name, Genre.bands(Band.name, as_list=True)) [ { "name": "Rock", "bands": ["Pythonistas", "C-Sharps"] }, ... ] ``` There are lots of other powerful features - [see the docs](https://piccolo-orm.readthedocs.io/en/latest/piccolo/schema/m2m.html) for more information. 
]]> <![CDATA[Building a great select widget]]> https://piccolo-orm.com/blog/building-a-great-select-widget/ https://piccolo-orm.com/blog/building-a-great-select-widget/ Fri, 10 Dec 2021 00:00:00 GMT A [`select`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/select) widget is easy right? We just do this in HTML: ```html <select name="genre"> <option value="rock">Rock</option> <option value="folk">Folk</option> </select> ``` Which gives us a standard dropdown. This works well when there are only a few options. However, when there are lots of options it causes some major issues: 1. The user experience becomes quite poor because the user has to scroll through lots of options looking for the right one. We could use a [`datalist`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/datalist) instead, which is searchable. However, it doesn't solve our second problem. 1. If the number of options is really high it can actually crash the web browser. This is a possibility when pulling all of the options from an API and creating the widget with Javascript. In [Piccolo Admin](https://github.com/piccolo-orm/piccolo_admin) we encountered this issue, because depending on the database table there could be millions of options. The obvious solution is to add a [`search`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/search) field instead: ```html <input type="search" name="genre" /> ``` Which gives us a search box. Now the user can just search for what they want, and there's no risk of crashing the web browser with lots of data. But even this isn't perfect. With a search field you assume that the user knows what they're searching for. And it's a far worse user experience if there are only a few options available - picking an option from a select widget is less effort than searching in this scenario. We need a widget which works for all cases. If there are only a few options, it should be convenient, and shouldn't require the user to search. But when there are lots of options, we allow the user to search, and it shouldn't crash the web browser. After much experimentation, we came up with this hybrid widget:
Hybrid select widget
The hybrid select widget
We render a search widget, and when the user clicks on it, we open the hybrid widget in a popup.
Hybrid select widget demo
Demo
The hybrid select widget preloads the first 5 results. This means that for situations where there aren't many options (for example gender) the user immediately clicks on the one they want, and job done. If they want to see a few more options, they click on 'load more'. And finally, they can just search. In this way we're able to have a widget which scales well - whether there are two options, or two million options. The widget is written in [Vue JS](https://vuejs.org/) - the [source code is on GitHub](https://github.com/piccolo-orm/piccolo_admin/blob/master/admin_ui/src/components/KeySearchModal.vue). It's available for [Piccolo Admin](https://github.com/piccolo-orm/piccolo_admin) users from version 0.19.1 onwards. ]]>
<![CDATA[Replicating GraphQL using REST, Piccolo, and FastAPI]]> https://piccolo-orm.com/blog/replicating-graph-ql-using-rest-piccolo-and-fast-api/ https://piccolo-orm.com/blog/replicating-graph-ql-using-rest-piccolo-and-fast-api/ Fri, 03 Dec 2021 00:00:00 GMT [GraphQL](https://graphql.org/) is a really powerful approach to building APIs. It allows clients to specify exactly what data they want. Contrast this with most [REST APIs](https://en.wikipedia.org/wiki/Representational_state_transfer), where a given endpoint typically returns the data in the same structure each time. The advantages of being able to request exactly the data we need are: - Less data needs to be transferred over the network. - By giving the client more flexibility, it's less likely a backend engineer will have to make arbitrary changes to the API (for example, adding / removing fields). - Potentially less load on the API server, if it only has to return what's needed, and not additional data. The downside though is GraphQL is quite a big investment in terms of setup and learning. Also, a lot of companies already have REST APIs. What if we can replicate some of the advantages of GraphQL using REST? Enter [Piccolo](https://github.com/piccolo-orm/piccolo) and [FastAPI](https://github.com/tiangolo/fastapi). ## PiccoloCRUD Piccolo has a class called [PiccoloCRUD](https://piccolo-api.readthedocs.io/en/latest/crud/piccolo_crud.html) which basically makes a super endpoint from a Piccolo table. As the name suggests, it supports all of the CRUD operations, and some really powerful filtering. We recently made some big improvements - namely, being able to request specific fields, and even doing joins. It also integrates seamlessly with [FastAPI](https://piccolo-api.readthedocs.io/en/latest/fastapi/index.html), so the endpoint has automatic Swagger docs. 
It's as simple as this: ```python from fastapi import FastAPI from piccolo_api.crud.endpoints import PiccoloCRUD from piccolo_api.fastapi.endpoints import FastAPIWrapper from movies.tables import Movie app = FastAPI() FastAPIWrapper( "/movies/", app, PiccoloCRUD(Movie, read_only=True, exclude_secrets=True, max_joins=1), ) ``` Here is the schema we're using: ```python from piccolo.columns import ( ForeignKey, Integer, Real, Varchar, ) from piccolo.table import Table class Director(Table): name = Varchar(length=300, null=False) net_worth = Integer(secret=True, help_text="In millions") class Movie(Table): name = Varchar(length=300) rating = Real(help_text="The rating on IMDB.") director = ForeignKey(references=Director) ``` You can get the [entire source code on GitHub](https://github.com/piccolo-orm/piccolo_videos/blob/main/making_a_powerful_rest_api_like_graphql/). ## Trying it out If we query the endpoint, we get a response like: ```json GET /movies/ { "rows": [ { "id": 1, "name": "Star Wars: A New Hope", "rating": 8.6, "director": 1 } ] } ``` Now let's try fetching a subset of fields, using the `__visible_fields` parameter: ```json GET /movies/?__visible_fields=name,director.name { "rows": [ { "name": "Star Wars: A New Hope", "director": { "name": "George Lucas" } } ] } ``` Note how we got a nested object when we specified `director.name` as a field name, as it belongs to a related table. Piccolo performs the necessary joins under the hood. You can also try this out via FastAPI's Swagger docs:
Swagger docs
All of the filters are visible in the Swagger docs
## Security You'll notice in the table definition that we designated the `Director.net_worth` column as `secret=True`. What this means is the value returned by the API is only ever `null`. It means we can shield sensitive information from clients if we want to. We can also limit the number of joins which are allowed, using the `max_joins` parameter on `PiccoloCRUD`. This prevents clients from doing queries which are overly complex, and would potentially slow down our API. ## Conclusions I hope this illustrates how powerful `PiccoloCRUD` is, and how we're able to build something with very little code which approximates what GraphQL can do. It's a great way of rapidly building an API, which could save on some bandwidth too! ]]>
<![CDATA[Making open source accessible]]> https://piccolo-orm.com/blog/making-open-source-accessible/ https://piccolo-orm.com/blog/making-open-source-accessible/ Wed, 13 Oct 2021 00:00:00 GMT <![CDATA[Python's graphlib is awesome]]> https://piccolo-orm.com/blog/python-s-graphlib-is-awesome/ https://piccolo-orm.com/blog/python-s-graphlib-is-awesome/ Tue, 12 Oct 2021 00:00:00 GMT Example database schema
A simple database schema
When creating the tables, we need to make sure that the `Manager` table is created before the `Band` table, as there's a foreign key from `Band` to `Manager`. In the parlance of graphs, each table is a node, and each foreign key is an edge. You might think, OK - let's just use Python's built-in [`sorted` function](https://docs.python.org/3/library/functions.html#sorted) to determine the correct order. For complex graphs, with multiple nodes and edges, the sort function just doesn't work. The `sorted` function works in situations like this: ```python >>> sorted([1,3,2,5,4]) [1,2,3,4,5] ``` When sorting more complex types, you can pass a `key` argument to `sorted`, telling it how to compare the various elements. But when each element in the list has complex relationships to other elements in the list, the output won't be what you expect. Thankfully `graphlib` comes to the rescue. Tools similar to `graphlib` have existed for a long time (for example [NetworkX](https://networkx.org/)), but having something in the standard library which solves common use cases is very welcome. All we have to do to sort the above schema is this: ```python from graphlib import TopologicalSorter # The graph is a dictionary mapping nodes to a set of connected nodes. graph = {'band': {'manager'}, 'manager': set()} sorter = TopologicalSorter(graph) ordered = tuple(sorter.static_order()) >>> print(ordered) ('manager', 'band') ``` That was a trivial example, here's a slightly more complex schema:
Example database schema
A slightly more complex database schema
```python from graphlib import TopologicalSorter graph = { 'band': {'manager'}, 'manager': set(), 'concert': {'band', 'venue'}, 'venue': set(), } sorter = TopologicalSorter(graph) ordered = tuple(sorter.static_order()) >>> print(ordered) ('manager', 'venue', 'band', 'concert') ``` I encourage you to check `graphlib` out - it's really useful, and quite fun. ]]>
<![CDATA[PostgreSQL 14 released]]> https://piccolo-orm.com/blog/postgre-sql-14-released/ https://piccolo-orm.com/blog/postgre-sql-14-released/ Mon, 11 Oct 2021 00:00:00 GMT <![CDATA[Easy Forms using Pydantic and Piccolo Admin]]> https://piccolo-orm.com/blog/easy-forms-using-pydantic-and-piccolo-admin/ https://piccolo-orm.com/blog/easy-forms-using-pydantic-and-piccolo-admin/ Thu, 23 Sep 2021 00:00:00 GMT It makes Piccolo Admin a great platform for building internal tools and business apps. It doesn't require any knowledge of HTML, CSS or Javascript. This is what it looks like:
Piccolo Admin screenshot
Forms are accessible in the sidebar.
Piccolo Admin screenshot
An example of a form.
Here is the `app.py` file: ```python # app.py from piccolo_admin.endpoints import create_admin, FormConfig from fastapi import FastAPI from starlette.requests import Request from pydantic import BaseModel, validator app = FastAPI() ################################################################################ class Order(BaseModel): item_name: str quantity: int customer_name: str @validator("quantity") def validate_quantity(cls, value): if value < 1: raise ValueError("You must order at least 1 item!") return value def order_handler(request: Request, model: Order): print( f"I just got an order from {model.customer_name} for " f"{model.quantity} x {model.item_name}" ) return "Processed order" order_form = FormConfig( name="Order Form", pydantic_model=Order, endpoint=order_handler, ) ################################################################################ app.mount( "/", create_admin( site_name="MyShop.com", tables=[], forms=[order_form], ), ) ``` And `piccolo_conf.py`: ```python # piccolo_conf.py from piccolo.conf.apps import AppRegistry from piccolo.engine.postgres import PostgresEngine DB = PostgresEngine(config={"database": "form_demo", "user": "postgres"}) # A list of paths to piccolo apps # e.g. ['blog.piccolo_app'] APP_REGISTRY = AppRegistry(apps=["piccolo_admin.piccolo_app"]) ``` To run the app: - Make sure the database exists. - Install the requirements - `pip install piccolo[all] piccolo_admin` - Run all migrations - `piccolo migrations forwards all` - Create a user to login with - `piccolo user create` - Start the app - `uvicorn app:app` ]]>
<![CDATA[Piccolo for Data Science Scripts]]> https://piccolo-orm.com/blog/piccolo-for-data-science-scripts/ https://piccolo-orm.com/blog/piccolo-for-data-science-scripts/ Fri, 17 Sep 2021 00:00:00 GMT <![CDATA[Talk Python To Me podcast]]> https://piccolo-orm.com/blog/talk-python-to-me-podcast/ https://piccolo-orm.com/blog/talk-python-to-me-podcast/ Sun, 08 Aug 2021 00:00:00 GMT <![CDATA[BlackSheep]]> https://piccolo-orm.com/blog/black-sheep/ https://piccolo-orm.com/blog/black-sheep/ Thu, 10 Jun 2021 00:00:00 GMT Some interesting features of BlackSheep are: * **OpenAPI support** - BlackSheep can [automatically create OpenAPI docs](https://www.neoteroi.dev/blacksheep/openapi/) from the type annotations of your endpoints, similar to FastAPI. * **Performance** - some of the BlackSheep internals are implemented in Cython, which should help deliver good performance. * **Flexible design** - endpoints can be class based or function based. Check out the [docs](https://www.neoteroi.dev/blacksheep/) for more details. ]]> <![CDATA[Frozen queries]]> https://piccolo-orm.com/blog/frozen-queries/ https://piccolo-orm.com/blog/frozen-queries/ Thu, 10 Jun 2021 00:00:00 GMT A feature which was recently added to Piccolo is [frozen queries](https://piccolo-orm.readthedocs.io/en/latest/piccolo/query_clauses/freeze.html). The purpose of an ORM / query builder like Piccolo is pretty simple - it converts a query defined in Python code into SQL. This process takes a little bit of time, which makes queries slightly slower than writing SQL by hand. In a typical web application, you'll usually have some queries which are run over and over again. Converting those queries into SQL each time they are run is a little bit wasteful. To tackle this, Piccolo queries can now be 'frozen'. This precalculates the SQL, so it only has to be calculated once, irrespective of how many times the query is run. 
Once a query is frozen, you can't apply any more clauses to it (`where`, `order_by` etc), as this would cause the SQL to be different. Here's an example: ```python LATEST_ARTICLES = Article.select( Article.id, Article.title ).order_by( Article.published_on, ascending=False ).limit( 10 ).output( as_json=True ).freeze() # In the corresponding view/endpoint of whichever web framework # you're using: async def latest_articles(self, request): return await LATEST_ARTICLES.run() ``` Bear in mind that most of the time spent running a query is waiting for a response from the database, so end users won't notice much difference if your queries are frozen or not. But for apps which require high throughput, every little helps, and it makes sense to use frozen queries where possible. ]]> <![CDATA[Piccolo column choices]]> https://piccolo-orm.com/blog/piccolo-column-choices/ https://piccolo-orm.com/blog/piccolo-column-choices/ Sat, 29 May 2021 00:00:00 GMT >> Director.select().where( >>> Director.gender == Director.Gender.male >>> ).run_sync() [{'id': 1, 'name': 'George Lucas', 'gender': 'm'}, ...] >>> director = Director( >>> name="Brenda Barton", >>> gender=Director.Gender.female >>> ) >>> director.save().run_sync() ``` [Piccolo Admin](/ecosystem/) also supports this feature. When a column has choices specified, a select widget is rendered in the UI.
Column choices UI
]]>
<![CDATA[Understanding sys.exit]]> https://piccolo-orm.com/blog/understanding-sys-exit/ https://piccolo-orm.com/blog/understanding-sys-exit/ Thu, 22 Apr 2021 00:00:00 GMT >>> python successful_script.py >>> echo $? 0 >>> python failing_script.py >>> echo $? 1 ``` You can then use these exit codes in things like `if` statements. ```bash >>> if python successful_script.py; then echo "successful"; else echo "error"; fi; successful >>> if python failing_script.py; then echo "successful"; else echo "error"; fi; error ``` Also, exit codes are very important in build tools like Docker. If the exit code indicates a command has failed, then the build will fail. ## Exit messages If you want to indicate why the failure occurred, then a string can be passed to `sys.exit`, in which case the exit code is treated as `1` (i.e. a failure), and the string is printed out. ```python # failing_script.py import sys sys.exit("Something bad happened") ``` Let's try it: ```bash >>> python failing_script.py Something bad happened ``` ## How does sys.exit work? Calling `sys.exit` actually just raises an exception. The exception is `SystemExit`. It's unusual for a codebase to catch `SystemExit` exceptions - but it can be done. ```python # refuse.py import sys try: sys.exit(1) except SystemExit: print("I refuse!") ``` If we call it: ```bash >>> python refuse.py I refuse! >>> echo $? 0 ``` ## Are there other exit codes? In 99% of situations, `0` and `1` are sufficient as exit codes. There are [others](https://tldp.org/LDP/abs/html/exitcodes.html), though using them is rare. ```python import sys sys.exit(127) # 127 means 'command not found' ``` ## Why not just raise exceptions instead? Rather than using `sys.exit`, you can just raise an exception. 
```python # exception_script.py raise Exception('Something went wrong') ``` If this exception is unhandled, and causes the program to crash, the exit code will be `1`, and a traceback will be printed out: ```bash >>> python exception_script.py Traceback (most recent call last): File "exception_script.py", line 1, in <module> raise Exception("Something went wrong") Exception: Something went wrong ``` Having more verbose output may be useful for debugging purposes. However, you don't necessarily want this level of information being shown to a user if it's a known exception. By using `sys.exit` you can exit the program, and just show a message without a traceback. Also, by using `sys.exit`, it indicates clearly within your code that the intention is to stop the program, vs an exception, which is often meant to be handled. ## Conclusions So, should you use `sys.exit`? In summary, here are some situations where it is useful: * If you're writing code which will be consumed on the command line, and you want to exit the program without showing a traceback. * If you want to return an exit code other than 0 or 1. * To indicate clearly within your code that the intention is to stop the program. ]]> <![CDATA[Postgres - one database to rule them all]]> https://piccolo-orm.com/blog/postgres-one-database-to-rule-them-all/ https://piccolo-orm.com/blog/postgres-one-database-to-rule-them-all/ Thu, 08 Apr 2021 00:00:00 GMT One of the advantages of Postgres is the many high quality extensions which make it suitable for storing a range of data. You can store: * JSON - [builtin](https://www.postgresql.org/docs/current/functions-json.html) * Spatial data - [PostGIS](https://postgis.net/) * Time series data - [TimescaleDB](https://www.timescale.com/) * Even graph data is in the works - [Apache AGE](https://age.apache.org/) Without these extensions you would require many different specialised databases. Supporting multiple databases means more maintenance work. 
Having all of your data in one place makes querying it easier. With a single SQL query you can join together time series data, spatial data, and any other relational data. If this data is spread over many databases, you need to join the data together using code, which is less performant, and more work. There is also great convenience in only needing to learn one query language - SQL. Many specialist databases have their own query languages, which take significant time and effort to learn. It's likely that a large proportion of developers on a team know at least some SQL. Having all of your data in Postgres also means you can use the tools you're familiar with. If your time series data is in a separate database, it's another set of tools you need to learn. Postgres has a huge ecosystem of tools - GUIs such as [pgAdmin](https://www.pgadmin.org/), drivers for most programming languages, and ORMs / query builders like [Piccolo](https://piccolo-orm.com/)! The extensibility of Postgres is really its killer feature. It's almost an operating system for your data. The fact that it can do so much, and does it so well, is remarkable. This is one of the reasons it's growing so quickly.
Database rankings
Source: db-engines.com
When I'm building systems, I argue hard to use Postgres for as much as possible. It results in a more streamlined architecture, with greater developer productivity, and easier onboarding. Postgres was initially released in 1996, but it feels like it's just getting started. ]]>
<![CDATA[Piccolo Admin tooltips]]> https://piccolo-orm.com/blog/piccolo-admin-tooltips/ https://piccolo-orm.com/blog/piccolo-admin-tooltips/ Tue, 23 Mar 2021 00:00:00 GMT column tooltip
table tooltip
Often database tables get quite complex, and providing hints to the user about what a table is for, and what the columns represent, can be very helpful. Adding them is very simple: ```python # tables.py from piccolo.columns import Varchar from piccolo.table import Table class Movie(Table, help_text="Movies which were released in cinemas."): name = Varchar(help_text="The name it was released under in the USA.") ``` And then we just run the admin as usual: ```python # app.py from piccolo_admin.endpoints import create_admin from tables import Movie app = create_admin(tables=[Movie]) if __name__ == '__main__': import uvicorn uvicorn.run(app) ``` ]]>
<![CDATA[Building an admin to handle millions of rows]]> https://piccolo-orm.com/blog/building-an-admin-to-handle-millions-of-rows/ https://piccolo-orm.com/blog/building-an-admin-to-handle-millions-of-rows/ Sun, 14 Mar 2021 00:00:00 GMT Piccolo Admin screenshot
The Piccolo Admin, in dark mode
## Generating lots of fake data The first step in this process was generating lots of fake data for testing with. The example schema used in the Piccolo Admin contains two tables - `Movie` and `Director`. The original dataset was painstakingly collected via Google searches - e.g. finding out what each movie grossed, and if they had Oscar nominations. Clearly this wasn't going to scale if we wanted to test with millions of rows. Plus there are only so many actual movies in existence. To generate fake data, while keeping it semi-realistic, the [Faker](https://pypi.org/project/Faker/) library was used. The benefit of using semi-realistic data is that it's easier to get a better sense of the user experience, compared to using Lorem ipsum everywhere. You can see the source code used for this [here](https://github.com/piccolo-orm/piccolo_admin/blob/6cd17f63b1d80c109695dbea3a6ab198be8868df/piccolo_admin/example.py#L91). ## What are the bottlenecks? After generating lots of fake data, we could identify the main bottlenecks. ### Pagination Currently the Piccolo Admin uses limit-offset pagination, which isn't efficient when the page number is high. However, even at very high page numbers, it's still usable. It just puts unnecessary load on the database. For a page size of 100, and reading page 1,000, the database will read 100,000 rows, and will throw away the first 99,900. That's just the way offset is implemented in Postgres. Work has started on more efficient pagination methods, but for now, it's still usable at high row counts. ### Foreign key selectors For the `Movie` table, each row has a foreign key to a `Director` row. The user needs an efficient way of selecting the director when inserting / editing rows, and also when filtering. This is the main bottleneck for supporting large database tables. If a simple select element is used, it needs to load all possible options for a director, which means loading the ID and an identifier (e.g.
director name) for every row in the `Director` table. Clearly this won't scale well in terms of performance. It also isn't a great UI - as the user needs to scroll through thousands of options in a select element to find the one they're after. The solution is to use a search input instead. For the filter sidebar, this has now been implemented. But for the edit and add pages, it will be implemented soon.
Foreign key selector - empty
Empty search field
Foreign key selector - with content
Search field with content
## Conclusions The recent improvements are a good start in making the Piccolo Admin scalable. We'll continue to make the UI and performance as good as possible with large datasets. ]]>
<![CDATA[Which is the fastest ASGI server?]]> https://piccolo-orm.com/blog/which-is-the-fastest-asgi-server/ https://piccolo-orm.com/blog/which-is-the-fastest-asgi-server/ Sun, 28 Feb 2021 00:00:00 GMT (Benchmark charts for Hypercorn and Daphne not shown.) ## Conclusions Uvicorn achieved roughly 40% more throughput than the others in this test. However, they all did well, and were stable under high load. ]]> <![CDATA[Deprecation warnings in Python code]]> https://piccolo-orm.com/blog/deprecation-warnings-in-python-code/ https://piccolo-orm.com/blog/deprecation-warnings-in-python-code/ Wed, 24 Feb 2021 00:00:00 GMT >>> python app.py app.py:16: DeprecationWarning: my_regrettable_function will be retired in version 1.0, please use my_awesome_function instead. my_regrettable_function() ``` If you like, you can redirect stderr into a different file, to capture all of the warnings: ``` >>> python app.py 2> warnings.txt ``` Or just straight up ignore the warnings altogether: ``` >>> python -W ignore app.py ``` One other cool thing you can do is to turn the warnings into exceptions, to be sure you're not running any deprecated code: ``` >>> python -W error app.py ``` The issue here is that deprecated code within Python itself will also start raising exceptions. To just target warnings within a given module: ``` >>> python -W error::DeprecationWarning:__main__ app.py Traceback (most recent call last): File "app.py", line 17, in <module> my_regrettable_function() File "app.py", line 6, in my_regrettable_function warnings.warn( DeprecationWarning: my_regrettable_function will be retired in version 1.0, please use my_awesome_function instead. 
``` ]]> <![CDATA[What is the maximum number of coroutines you should run concurrently?]]> https://piccolo-orm.com/blog/what-is-the-maximum-number-of-coroutines-you-should-run-concurrently/ https://piccolo-orm.com/blog/what-is-the-maximum-number-of-coroutines-you-should-run-concurrently/ Tue, 23 Feb 2021 00:00:00 GMT 0: iterations += math.ceil(remainder / chunk_size) for i in range(iterations): chunk = coroutines[i * chunk_size : (i + 1) * chunk_size] await asyncio.gather(*chunk) if __name__ == "__main__": for test in (run, run_batched): start = time.time() asyncio.run(test()) end = time.time() delta = end - start print(delta) ``` With `COROUTINE_COUNT=10000`: * Unbatched: 0.51 seconds * Batched: 1.69 seconds With `COROUTINE_COUNT=100000`: * Unbatched: 6.24 seconds * Batched: 5.86 seconds This was run on a 2.6 GHz Intel Core i7 (9th Gen) processor, with 16 GB of RAM. You'll see that batching up coroutines is much slower, unless we get to incredibly high numbers of coroutines (100,000). I didn't expect this. I thought the event loop would struggle much sooner. Let's try the experiment again, but replacing `asyncio.sleep` with actual network calls. ```python import httpx async def test_coroutine(): """ Doing actual network calls now. """ async with httpx.AsyncClient() as client: response = await client.get("https://www.google.co.uk") assert response.status_code == 200 ``` With `COROUTINE_COUNT=100`: * Unbatched: 2.12 seconds * Batched: 2.90 seconds As you can see, batching is also slower in this case. When trying with `COROUTINE_COUNT=1000` I started getting network timeouts. In this situation, batching does make sense - if you run all of the coroutines at once you're more likely to encounter network issues. The same is true when connecting to a database - unless you're using a connection pool, you will start seeing errors if Postgres has more than 100 open connections. 
## Conclusions The asyncio event loop is surprisingly good at handling large numbers of coroutines concurrently. However, be wary of scheduling too many coroutines which require network access, as you'll hit other bottlenecks (rate limiting, network etc). ]]> <![CDATA[Top level await in Python]]> https://piccolo-orm.com/blog/top-level-await-in-python/ https://piccolo-orm.com/blog/top-level-await-in-python/ Wed, 11 Nov 2020 00:00:00 GMT <![CDATA[Database column defaults in Piccolo]]> https://piccolo-orm.com/blog/database-column-defaults-in-piccolo/ https://piccolo-orm.com/blog/database-column-defaults-in-piccolo/ Fri, 26 Jun 2020 00:00:00 GMT <![CDATA[Python package versioning]]> https://piccolo-orm.com/blog/python-package-versioning/ https://piccolo-orm.com/blog/python-package-versioning/ Mon, 18 May 2020 00:00:00 GMT =10.2,<11. This is a reasonable solution. It does create some ambiguity though, which could result in bugs. It's unlikely you'll run unit tests for your project with every dependency version in the range. It's important that the developer is disciplined with their package versioning, so the major version is incremented if there are any backwards incompatible changes. ## Sanity check I think it's really important for a library author to have a sanity check in place, so they know that the latest version of their library can be installed, and works as expected. For the piccolo admin, I deploy a [demo site](http://demo1.piccolo-orm.com/), so I can check it all still works at a high level. ## Conclusions For Piccolo, I've decided to loosen the dependency requirements. I'd also like to get to version 1.0 very soon. ]]> <![CDATA[Build a Python CLI quickly with targ]]> https://piccolo-orm.com/blog/build-a-python-cli-quickly-with-targ/ https://piccolo-orm.com/blog/build-a-python-cli-quickly-with-targ/ Wed, 22 Apr 2020 00:00:00 GMT >> python main.py add 1 1 2 ``` To get documentation: ```bash >>> python main.py add --help add === Add the two numbers. 
Usage ----- add a b Args ---- a The first number. b The second number. ``` I encourage you to give it a try. * [GitHub](https://github.com/piccolo-orm/targ) * [Read the docs](https://targ.readthedocs.io/en/latest/index.html) More advanced features are coming soon. ]]> <![CDATA[Auto migrations]]> https://piccolo-orm.com/blog/auto-migrations/ https://piccolo-orm.com/blog/auto-migrations/ Sun, 15 Mar 2020 00:00:00 GMT migration graphic ]]> <![CDATA[Database migrations]]> https://piccolo-orm.com/blog/database-migrations/ https://piccolo-orm.com/blog/database-migrations/ Mon, 02 Mar 2020 00:00:00 GMT <![CDATA[Piccolo transactions]]> https://piccolo-orm.com/blog/piccolo-transactions/ https://piccolo-orm.com/blog/piccolo-transactions/ Wed, 26 Feb 2020 00:00:00 GMT <![CDATA[Exception handling in asyncio]]> https://piccolo-orm.com/blog/exception-handling-in-asyncio/ https://piccolo-orm.com/blog/exception-handling-in-asyncio/ Sun, 23 Feb 2020 00:00:00 GMT >>> main() Handled exception ``` However, there are some situations where things get interesting. ## asyncio.gather You can use [asyncio.gather](https://piccolo-orm.com/blog/asyncio-gather/) to launch several coroutines, which are then executed concurrently. ```python import asyncio async def hello(): # To simulate a network call, or other async work: await asyncio.sleep(1) print('hello') async def main(): await asyncio.gather( hello(), hello(), hello() ) >>> asyncio.run(main()) hello hello hello ``` What happens if one of the coroutines raises an exception? The default behavior is for the first exception raised by any of the coroutines to be propagated to the call site of asyncio.gather. The other coroutines continue to run. If more than one of the coroutines raises an exception, you'll only be aware of the first. If you need to run some cleanup code to handle an exception (for example, rolling back a transaction), then you could potentially miss it if a different coroutine raises an exception first. 
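This behaviour is easy to verify with a small script (the `good` / `bad` coroutine names are just for illustration) - the first exception propagates to the `await`, while the surviving coroutines keep running:

```python
import asyncio

completed = []


async def good(tag):
    await asyncio.sleep(0.01)
    completed.append(tag)


async def bad():
    raise ValueError("boom")


async def main():
    caught = False
    try:
        await asyncio.gather(good("a"), bad(), good("b"))
    except ValueError:
        # The await raised as soon as bad() failed:
        caught = True
    # Give the surviving coroutines time to finish:
    await asyncio.sleep(0.05)
    return caught


caught = asyncio.run(main())
print(caught, sorted(completed))  # True ['a', 'b']
```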
You might also wonder when the exception is handled - is it as soon as it's raised, or only when all of the coroutines have completed? Fortunately, asyncio.gather has an option called **return_exceptions**, which returns the exceptions instead of raising them. ```python import asyncio async def good(): return 'OK' async def bad(): raise ValueError() async def main(): responses = await asyncio.gather( bad(), good(), bad(), return_exceptions=True ) print(responses) # >>> [ValueError(), 'OK', ValueError()] ``` We are now aware of every exception which happened. But as a programmer, what do we do with a list of values and exceptions? It feels quite alien. To solve this problem, I created a library called [asyncio_tools](https://github.com/piccolo-orm/asyncio_tools), which wraps `gather` to make it more user friendly. ```python import asyncio_tools async def good(): return 'OK' async def bad(): raise ValueError() async def main(): response = await asyncio_tools.gather( bad(), good(), bad(), ) # We can easily get just the successful results print(response.successes) # >>> ['OK'] # And the exceptions. print(response.exceptions) # >>> [ValueError(), ValueError()] # We can easily check if we got a certain type of exception if ValueError in response.exception_types: print('Received a ValueError exception') # We can combine all of the exceptions into a 'CompoundException': exception = response.compound_exception() if exception: raise exception() ``` If we raise a `CompoundException`, it allows us to return information about several exceptions. When we catch such an exception, we can do the following: ```python async def main(): try: await some_coroutine() except asyncio_tools.CompoundException as exception: print(exception) # >>> 'CompoundException, 2 errors [ValueError, ValueError]' if ValueError in exception.exception_types: print('Caught a ValueError') ``` This makes handling exceptions in concurrent code easier - I encourage you to check it out. 
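If you'd rather not take on a dependency, the core idea can be approximated with `return_exceptions=True` and a simple partition - a sketch, not the library's actual implementation:

```python
import asyncio


async def good():
    return "OK"


async def bad():
    raise ValueError()


async def gather_partitioned(*aws):
    """Gather the awaitables, splitting results from exceptions."""
    responses = await asyncio.gather(*aws, return_exceptions=True)
    successes = [r for r in responses if not isinstance(r, BaseException)]
    exceptions = [r for r in responses if isinstance(r, BaseException)]
    return successes, exceptions


successes, exceptions = asyncio.run(gather_partitioned(bad(), good(), bad()))
print(successes)        # ['OK']
print(len(exceptions))  # 2
```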
]]> <![CDATA[Python contextvars]]> https://piccolo-orm.com/blog/python-contextvars/ https://piccolo-orm.com/blog/python-contextvars/ Sat, 22 Feb 2020 00:00:00 GMT asyncio contextvars
An example asyncio program
In the above diagram, you can see an example asyncio program. * The entry point is a coroutine which is run using `asyncio.run`, which wraps it in a task. * Whenever a new task is created, a snapshot of the parent context is taken, and this applies to the new task. Any subsequent changes to the parent context don't apply to the child task. Even though it might seem like lots of things are going on at once in an asyncio program, in reality it's just hopping between different tasks, which have their own context. We can use context managers to manipulate the context in the task - it won't bleed out to the other existing tasks, as they took a snapshot of the context when they were created. Here's an example: ```python from contextvars import ContextVar from my_library import get_connection # If we don't give it a default, then it raises a LookupError if we try and # access the value using connection.get(), without having first set a value # using connection.set(some_value). connection = ContextVar('connection', default=None) # This is similar to what Piccolo does: class Transaction(): async def __aenter__(self): self.connection = await get_connection() self.transaction = await self.connection.get_transaction() self.token = connection.set(self.connection) await self.transaction.start() async def __aexit__(self, exception_type, exception, traceback): if exception: await self.transaction.rollback() else: await self.transaction.commit() await self.connection.close() # This removes the connection from the current context: connection.reset(self.token) async def run_in_transaction(sql): # We don't have to pass the connection explicitly - we can get it from # the context. 
_connection = connection.get() if _connection: return await _connection.run(sql) async def main(): async with Transaction(): await run_in_transaction('select * from foo') if __name__ == '__main__': asyncio.run(main()) ``` ## Resources * [A good article on contextvars](https://www.pythoninsight.com/2019/03/context-variables/) ]]>
<![CDATA[Postgres concurrency]]> https://piccolo-orm.com/blog/postgres-concurrency/ https://piccolo-orm.com/blog/postgres-concurrency/ Sun, 16 Feb 2020 00:00:00 GMT <![CDATA[asyncio.gather]]> https://piccolo-orm.com/blog/asyncio-gather/ https://piccolo-orm.com/blog/asyncio-gather/ Sat, 15 Feb 2020 00:00:00 GMT <![CDATA[Netlify vs Self Hosting]]> https://piccolo-orm.com/blog/netlify-vs-self-hosting/ https://piccolo-orm.com/blog/netlify-vs-self-hosting/ Sat, 15 Feb 2020 00:00:00 GMT <![CDATA[Namespacing Python attributes]]> https://piccolo-orm.com/blog/namespacing-python-attributes/ https://piccolo-orm.com/blog/namespacing-python-attributes/ Tue, 10 Sep 2019 00:00:00 GMT >>> employee._Person__name 'Bob' >>> employee._Employee__name 'Security Guard' ``` Python automatically modifies the attribute name to contain the name of the class. This prevents collisions. It's useful for libraries which provide base classes which are meant to be subclassed by users. It's very cool, but you don't see it used often, most likely because it's not widely known about. Note, it only works if the attribute name has at most one trailing underscore, to avoid confusion with magic methods (see below). ## Magic methods Magic methods (also called dunder methods) are attributes which have a double underscore prefix and suffix. The one most people know is `__init__` in classes. It allows Python to implement functionality transparently, without adding additional syntax. For example, calling an object just calls its `__call__` method. Instantiating an object just calls its `__init__` method. It's tempting for library authors to use this dunder syntax for their own variables but it's not recommended. The magic methods are how Python implements some important functionality, and what if they add a new magic method in the future, which clashes with your own variable? It's unlikely, but best to be safe. Consider the dunder namespace to be just for the Python runtime. 
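Returning to name mangling - here's a runnable demonstration of the double underscore behaviour described above:

```python
class Person:
    def __init__(self):
        self.__name = "Bob"  # stored as _Person__name


class Employee(Person):
    def __init__(self):
        super().__init__()
        self.__name = "Security Guard"  # stored as _Employee__name


employee = Employee()
# The two attributes coexist - no collision between parent and child:
print(employee._Person__name)    # Bob
print(employee._Employee__name)  # Security Guard
```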
## Nested classes In some libraries you'll see this: ```python # Representing a database table. class Table(): class Meta(): tablename = 'awesome_table' ``` If we'd done this instead: ```python class Table(): tablename = 'awesome_table' ``` It means the user can't subclass `Table`, and define their own tablename variable without breaking the library in some way. ```python # We can now do this, if we wanted a tablename column on our table: class MyTable(Table): tablename = CharField() class Meta(): tablename = 'awesome_table' ``` But what's actually going on when we define classes inside classes? There's actually nothing weird about this in Python - it's just like declaring any other attribute. We access them as you'd expect - `MyTable.Meta.tablename` or `MyTable().Meta.tablename`. One advantage of this approach is that, as it's generally just used in libraries, it's unlikely a user would want to define a Meta attribute of their own, which avoids a naming collision. A disadvantage is the inner class can't access attributes in the outer class (for example, inside a method). In theory you can do some metaclass magic to bind a reference to the outer class in the inner class, but this is seriously advanced / dubious jank. Also, nested classes can seem a bit strange to users at first, who might be confused by it. ### Inheritance We can use inheritance on the nested class, which is quite interesting: ```python class Table(): class Meta(): foo = 1 class MyTable(Table): class Meta(Table.Meta): bar = 2 MyTable().Meta.foo >>> 1 MyTable().Meta.bar >>> 2 ``` ## Dictionaries Alternatively, we can just use a dictionary: ```python class MyTable(Table): tablename = CharField() meta = { 'tablename': 'awesome_table' } ``` Which works fine if you just want some simple values. Classes allow you to namespace methods too, so are generally preferable. 
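For the curious, here's one sketch of the 'metaclass magic' mentioned above (the `outer` attribute name is my invention) - though, as noted, it's rarely worth the complexity:

```python
class TableMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        meta = namespace.get("Meta")
        if meta is not None:
            # Bind a reference to the outer class onto the inner class:
            meta.outer = cls
        return cls


class Table(metaclass=TableMeta):
    class Meta:
        tablename = "awesome_table"


print(Table.Meta.outer is Table)  # True
```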
## Naming conventions Finally, we can just prefix our attributes with an identifier, in this case `piccolo_`: ```python class MyTable(Table): piccolo_tablename = 'awesome_table' ``` ## Conclusions We've looked at a few different solutions for namespacing attributes. In an ideal world, our classes are simple enough to not need any of these techniques, but in larger libraries like Piccolo it can lead to more understandable and robust code. ]]> <![CDATA[Understanding JWT and Sessions]]> https://piccolo-orm.com/blog/understanding-jwt-and-sessions/ https://piccolo-orm.com/blog/understanding-jwt-and-sessions/ Sat, 10 Aug 2019 00:00:00 GMT <![CDATA[Cross Site Request Forgery]]> https://piccolo-orm.com/blog/cross-site-request-forgery/ https://piccolo-orm.com/blog/cross-site-request-forgery/ Sat, 10 Aug 2019 00:00:00 GMT {% csrf_token %} {{ form.as_p }} ``` When the form is submitted, the website also sends back the CSRF token contained in the cookie. Django checks that the token in the cookie matches the hidden field value in the form. This protects against CSRF because a malicious website is unable to read/set cookies on another domain. So a malicious link on `evil.com` wouldn't know what value to set the hidden form field to. If you have a Single Page Application, then JavaScript is used to add the CSRF token as a header to all AJAX calls instead. ## CSRF protection and mobile apps CSRF is only a problem with browsers. However, if you have an API which is used by mobile apps as well, you need to work around it. CSRF protection usually involves more than just checking for the presence of a token - it also looks at the referer header. You could try spoofing all of this in the mobile app, so it sets all the appropriate HTTP headers manually. It's generally just best to bypass CSRF checks for non-web apps, as it doesn't really serve a purpose. In Django's case, this is done by adding a `csrf_exempt` decorator to your view. 
This becomes a bit messy though if we need to maintain two separate views - one which enforces CSRF validation, and one that doesn't. One limitation of Django is we can only apply middleware globally to all views. With ASGI, and frameworks like Starlette, we can apply middleware to a subset of views. With JWT based authentication, we can add a claim such as ``mobile``, which allows an app to access views without a CSRF token. Consider this pseudo code: ```python views = [view1, view2, view3] app = Router({ '/mobile': CheckJWTClaimMiddleware(views), '/web': CSRFMiddleware(views) }) ``` This allows us to support mobile and web apps. ]]> <![CDATA[Should I use Python properties?]]> https://piccolo-orm.com/blog/should-i-use-python-properties/ https://piccolo-orm.com/blog/should-i-use-python-properties/ Thu, 08 Aug 2019 00:00:00 GMT >>> User('Shirley', 'Jones').full_name Shirley Jones ``` The problem is they can cause confusion. ```python class User(): def __init__(self, first_name, last_name): self.first_name = first_name self.last_name = last_name @property def get_full_name(self): return f'{self.first_name} {self.last_name}' # Feels weird: >>> User('Shirley', 'Jones').get_full_name Shirley Jones ``` Just by changing the method name, it feels unnatural for this to be a property. Calling `my_user.get_full_name` feels like it should have brackets after it, because it sounds like a function. So naming is definitely important when using properties. Also, properties work great if you're confident you won't need to add any arguments in the future. Imagine we wanted to modify `get_full_name` so it had an `include_title` argument. 
If we implemented it as a property, we'll break everyone's code, because now it'll have to be called as a function to work properly: ```python class User(): def __init__(self, first_name, last_name): self.first_name = first_name self.last_name = last_name def get_full_name(self, include_title=False): fullname = f'{self.first_name} {self.last_name}' if include_title: fullname = f'Madam {fullname}' return fullname # We broke our existing code: >>> User('Shirley', 'Jones').get_full_name Error! ``` This might not matter much in small projects, but if you're a library author you don't want to introduce a breaking change just by adding an argument to a property. In API design, properties can be overused too. If you're designing a fluent interface, you don't want to add a cognitive load to a programmer by making them consider 'is this a property or a method?'. Take this example: ```python class Select(Query): def where(self, query) -> Select: # do stuff return self @property def first(self) -> Select: # do stuff return self def run(self): return 'some data' ``` To use this API: ```python select = Select().where(some_query).first.run() ``` Rather than having to remember that `first` is a property, it's cleaner to have them all as plain methods. ```python select = Select().where(some_query).first().run() ``` Sure, it takes a couple more key strokes, but sometimes consistency is king. And lastly, perhaps the main way properties can be abused is if a really heavy piece of computation, or a long network request, is done to generate the response. A developer could unexpectedly cripple their app's performance by calling an innocent looking property too many times. So in conclusion, properties can be great - but consider if you really need them, and if so keep them simple. 
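As an aside (not covered above): if a property genuinely must do heavy work, `functools.cached_property` (Python 3.8+) at least guarantees the work happens only once per instance:

```python
import functools

call_count = 0


class Report:
    @functools.cached_property
    def stats(self):
        global call_count
        call_count += 1
        # Imagine an expensive query or computation here:
        return {"rows": 100}


report = Report()
print(report.stats)  # {'rows': 100}
print(report.stats)  # {'rows': 100} - cached, no recomputation
print(call_count)    # 1
```

It's still worth asking whether a plain method would communicate the cost more honestly to callers.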
]]> <![CDATA[Improving tab completion in Python libraries]]> https://piccolo-orm.com/blog/improving-tab-completion-in-python-libraries/ https://piccolo-orm.com/blog/improving-tab-completion-in-python-libraries/ Mon, 22 Jul 2019 00:00:00 GMT Select: # lots of code return Select() ``` The reason this is so useful is that a tool like Jedi can easily infer the return type, without having to work it out from the actual function body. It's really important for methods which are part of a fluent API. It allows tab completion in situations like this: ```python # We can continue using tab completion even after a method call: Band.select().where(Band.name == 'Radiohead').first().run_sync() ``` ## Mixins can be problematic Piccolo originally consisted of a bunch of Query subclasses like `Select`, `Insert`, `Delete` etc. Shared functionality like 'where' clauses were implemented via mixins. ```python # Some early Piccolo pseudo-code class WhereMixin(): def where(self, values): # do some stuff return self class Select(Query, WhereMixin): pass ``` You'll see here that mixins are problematic - since the `WhereMixin` can be used anywhere, the return type of the `where` method could be anything. This is clearly a big problem for tab completion. The way around this is to not use mixins, and use composition instead. ```python # Some early Piccolo pseudo-code class WhereDelegate(): def where(self, values): # do some stuff return class Select(Query): def __init__(self): self.where_delegate = WhereDelegate() def where(self, values) -> Select: self.where_delegate.where(values) return self ``` Now we're able to specify a concrete return type. ## Decorators can be deceiving If decorators aren't implemented correctly, they can mask the signature of the function being decorated. 
Take this example: ```python def my_decorator(func): def wrapper(): print('I am wrapped') func() return wrapper @my_decorator def hello_world() -> str: return 'hello world' hello_world() >>> I am wrapped hello_world.__name__ >>> 'wrapper' hello_world.__annotations__ >>> {} ``` In the example above, the annotations and original function name have been lost. It's effectively giving false information to any introspection tools, like Jedi. You can fix this though: ```python from functools import wraps def my_decorator(func): @wraps(func) def wrapper(): print('I am wrapped') func() return wrapper @my_decorator def hello_world() -> str: return 'hello world' hello_world.__annotations__ >>> {'return': str} hello_world.__name__ >>> 'hello_world' ``` If tab completion is a high priority, keep decorators simple and make sure you use `wraps`. The `wraps` function copies some important attributes from the wrapped function to the decorator (including `__name__`, `__annotations__`, and `__doc__`). Making decorators accurately reflect the wrapped function is a surprisingly deep subject. This is a great [article](http://blog.dscpl.com.au/2014/01/how-you-implemented-your-python.html) on the subject, which is part of an entire series of [articles](https://github.com/GrahamDumpleton/wrapt/tree/develop/blog). ## Some setattr magic This is something particular to Piccolo, and not every project will require it. When you enter say `Band.manager` (where `manager` is a foreign key), it would be nice to be able to keep on using tab completion to see the columns on the `Manager` table. And likewise, if the `Manager` table contains any foreign keys, to be able to follow them using tab completion as well. With other ORMs, you would express this using a string. For example, in Django it would be a string like `'manager__name'`. This is fine, but when you have large, complex models, it's nice to have tab completion. 
The way Piccolo achieves this: when you access `Band.manager`, the constructor creates an attribute on the object for each column in the table the foreign key points to. So for the `name` column, a `name` attribute is created on the object - allowing you to do `Band.manager.name`.

## Conclusions

Tab completion is a powerful tool for developers, and with a bit of thought we can create libraries which leverage it to its fullest. ]]> <![CDATA[ORM design challenges]]> https://piccolo-orm.com/blog/orm-design-challenges/ https://piccolo-orm.com/blog/orm-design-challenges/ Wed, 23 Jan 2019 00:00:00 GMT <![CDATA[Plugins for Python Projects]]> https://piccolo-orm.com/blog/plugins-for-python-projects/ https://piccolo-orm.com/blog/plugins-for-python-projects/ Tue, 22 Jan 2019 00:00:00 GMT <![CDATA[Why Python type annotations are awesome]]> https://piccolo-orm.com/blog/why-python-type-annotations-are-awesome/ https://piccolo-orm.com/blog/why-python-type-annotations-are-awesome/ Tue, 15 Jan 2019 00:00:00 GMT >> {'user': User} ``` This makes the annotations easier to access than parsing a docstring, and allows for some interesting applications.

## Mypy

[Mypy](http://mypy-lang.org/) uses the type annotations to analyse your code for errors.

```python
def say_hello(name: str):
    print(name)

say_hello(1)  # Error!
```

Visual Studio Code supports it out of the box. Combined with a linter like Flake8, your editing experience is supercharged - catching most coding errors you're likely to encounter. Having type checks provides you with an extra level of confidence that your code is working as expected. This is especially useful when refactoring large projects.

## Progressive enhancement

One criticism you sometimes hear is: why not just use a statically typed language? What's nice about Mypy (and its companion in the JavaScript world, TypeScript) is that you can add type annotations incrementally. Creating a quick and dirty prototype? Leave the annotations out for now.
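To illustrate this incremental approach, here's a minimal sketch (the function names are made up for illustration): you can leave a prototype untyped, then annotate functions one at a time as they stabilise, and Mypy will only check the annotated ones.

```python
# Prototype stage: no annotations - Mypy skips unannotated
# functions by default, so nothing is checked here yet.
def parse_price(raw):
    return float(raw.strip('$'))

# Later: add annotations to the functions you want checked.
# Mypy now verifies every call site of this function.
def parse_price_typed(raw: str) -> float:
    return float(raw.strip('$'))

print(parse_price('$9.99'))        # 9.99
print(parse_price_typed('$9.99'))  # 9.99
```

Existing callers keep working unchanged, which is what makes gradual adoption practical on large codebases.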
A library can use type annotations (like Piccolo), and the user doesn't need to care - they can use Python as they always have. But the library author has that extra level of confidence that their code works as expected.

## Advanced examples

To finish off, here are some examples of the interesting things you can do with type annotations in Python.

```python
import typing as t  # Importing it as an alias makes it less verbose

# You can assign type annotations to variables:
Pet = t.Union[Dog, Cat, Hamster]

# pet can be a Dog, Cat, or Hamster
def say_name(pet: Pet):
    print(pet.name)

# license_number can be None or an int
def create_driver(name: str, license_number: t.Optional[int] = None):
    print(f'Creating {name} with license {license_number}')

class Dog():
    # Forward references are allowed (as a string, or unquoted in
    # Python 3.7+ with `from __future__ import annotations`) i.e.
    # the return type can be the class currently being defined.
    def return_friend(self) -> 'Dog':
        return some_dog

# If you want to return a type defined in another file, and
# are only importing it for use as a type annotation, you
# can do this:
if t.TYPE_CHECKING:
    from animals import Budgie

# Type annotations can also be used on variables (note the string
# forward reference, since `Budgie` isn't imported at runtime):
budgies: t.List['Budgie'] = []
```

As you can see, the `typing` module is already very powerful - give it a go!
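Building on the `create_driver` example above, annotations can also be read back at runtime with `typing.get_type_hints`, which resolves them into real type objects (a small sketch - the `-> str` return annotation is added here for illustration):

```python
import typing as t

def create_driver(name: str, license_number: t.Optional[int] = None) -> str:
    return f'Creating {name} with license {license_number}'

# Annotations live on the function object; get_type_hints resolves
# them (including string forward references) into real types:
hints = t.get_type_hints(create_driver)
print(hints['name'])              # <class 'str'>
print(hints['license_number'])    # typing.Optional[int]
print(create_driver('Bob', 123))  # Creating Bob with license 123
```

This is the same mechanism that libraries like Pydantic use to build validation from plain annotations.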
]]> <![CDATA[Introduction to ASGI]]> https://piccolo-orm.com/blog/introduction-to-asgi/ https://piccolo-orm.com/blog/introduction-to-asgi/ Fri, 07 Dec 2018 00:00:00 GMT <![CDATA[asyncio vs gevent]]> https://piccolo-orm.com/blog/asyncio-vs-gevent/ https://piccolo-orm.com/blog/asyncio-vs-gevent/ Thu, 29 Nov 2018 00:00:00 GMT <![CDATA[Is async worthwhile?]]> https://piccolo-orm.com/blog/is-async-worthwhile/ https://piccolo-orm.com/blog/is-async-worthwhile/ Sat, 03 Nov 2018 00:00:00 GMT <![CDATA[Why is an event loop useful?]]> https://piccolo-orm.com/blog/why-is-an-event-loop-useful/ https://piccolo-orm.com/blog/why-is-an-event-loop-useful/ Wed, 10 Oct 2018 00:00:00 GMT >> 0 _counter.__next__() >>> 1 _counter.__next__() >>> 2 ```

In early versions of asyncio, generators were used directly. Now the `async` and `await` keywords are used instead, but the underlying mechanisms are the same. As well as performance advantages, an event loop also provides some nice abstractions which make life easier for developers. In the case of asyncio, you don't have to worry about sockets - they're abstracted away. Likewise, you don't have to worry about how a task gets scheduled - the event loop takes care of that too. One of my favourite features that asyncio provides is the `gather` function:

```python
import asyncio

async def hello(name):
    # This would usually involve some IO - to a db or something.
    print(f'hello {name}')

async def hello_everyone():
    await asyncio.gather(
        hello('bob'),
        hello('sally'),
        hello('fred')
    )
    print("welcome!")

asyncio.run(hello_everyone())

>>> hello bob
>>> hello sally
>>> hello fred
>>> welcome!
```

`asyncio.gather` makes it very easy to wait until a bunch of tasks have all finished. It's an example of the sorts of nice features which can be built on top of the event loop abstraction. And last but not least, event loops make a lot of sense in Python due to the Global Interpreter Lock (GIL), which limits the effectiveness of multi-threaded programs.
This makes event loops, which provide concurrency using a single thread, more attractive. ]]> <![CDATA[Should I use Python instead of Golang or Node?]]> https://piccolo-orm.com/blog/should-i-use-python-instead-of-golang-or-node/ https://piccolo-orm.com/blog/should-i-use-python-instead-of-golang-or-node/ Fri, 05 Oct 2018 00:00:00 GMT <![CDATA[Why choose Piccolo?]]> https://piccolo-orm.com/blog/why-choose-piccolo/ https://piccolo-orm.com/blog/why-choose-piccolo/ Mon, 01 Oct 2018 00:00:00 GMT <![CDATA[Reasons to use an ORM]]> https://piccolo-orm.com/blog/reasons-to-use-an-orm/ https://piccolo-orm.com/blog/reasons-to-use-an-orm/ Mon, 01 Oct 2018 00:00:00 GMT