My developer experience
Personal blog, about my open source activities.
Markus Staab
https://staabm.github.io/feed.xml (Jekyll, updated 2026-02-05T09:17:44+00:00)

PHPStan on steroids
2026-01-25
https://staabm.github.io/2026/01/25/phpstan-on-steroids

Similar to last year, while on vacation from my primary job I focused my open source efforts on improving PHPStan performance.

You might also be interested in other articles of this PHP performance series.

I teamed up with Ondřej Mirtes - the creator of PHPStan - to analyze several PHPStan use cases in depth. We identified and discussed several potential performance opportunities, which resulted in lots of performance-oriented changes.

We turned over every stone in the codebase and looked through all PHPStan components to make it as efficient as it can get. It took us ~6 weeks of collaboration, which led us to PHPStan 2.1.34 - a release loaded with patches aimed at speeding it up. I alone was able to contribute ~100 pull requests - not counting the changes Ondřej worked on at the same time.

Our internal “reference benchmark” improved by ~50% in runtime. Running this version pre-release on a few codebases suggested real-world projects should see 25% to 40% faster analysis times. In the related release discussion we asked our users how they experience the new version. We are happy to see our end users can reproduce the improvements on real-world projects. People report improvements of 8% to 40%.

We would love to see your raw performance numbers - please share them with us.

PHPStan loves InfectionPHP

Another focus of this work was reducing bootstrap overhead, so that running the PHPStan integration during mutation testing with InfectionPHP gets faster. This area was mostly IO-oriented, because when analyzing only a few files of a mutation the overall time was dominated by reading and AST-parsing files, and loading the configuration.

In our reference mutation testing example, we measured 70% fewer files being read/analyzed and the PHPStan invocation got ~40% faster.

Known problems

We are aware that after the 2.1.34 release analyzing files requires more memory. After a few experiments we already have ideas on how to improve on it. This will be explored in upcoming releases. Stay tuned.

Saving resources all over the world

We expect these changes to considerably reduce the amount of energy used in CI pipelines.

If you are using PHPStan in projects this will also considerably reduce the wait time for the engineers. A shorter feedback loop helps developers to stay focused and work more efficiently.

Nowadays, this can be especially important when paired with AI tooling workflows, which can be particularly slow and/or produce large amounts of source code changes containing bugs which are hard to catch by the human eye.

Chances are high that you or your company are saving a lot of money with recent releases. Please consider supporting my work, so I can make sure PHPStan stays as fast as possible and evolves to the next level.

All this performance work was done while on holiday from my primary job.

Markus Staab

Speedup PHPUnit code coverage generation
2025-11-26
https://staabm.github.io/2025/11/26/speedup-phpunit-code-coverage

While working on the PHPStan codebase I recently realized we spent a considerable amount of time generating code-coverage data, which we need later on to feed the Infection-based mutation testing process.

Running mutation testing in our continuous integration pipeline based on GitHub Actions took ~15m 30s in total per PHP version we support. In this article I will describe how I approached this problem and what we came up with.

Most of the following ideas and optimizations will also fit for other PHPUnit code coverage use cases.

Getting a better idea of what is slow

As a very first step I tried to divide the big block of work into smaller parts, to get a better understanding of which part is actually slow. Therefore, separating Infection’s preparatory initial-tests step from the actual mutation testing was my first take. This can be achieved by running Infection with --skip-initial-tests and recording the coverage data beforehand in a separate step. The resulting GitHub Actions steps look like this:

# see https://infection.github.io/guide/command-line-options.html#coverage
- name: "Create coverage in parallel"
  run: |
    php -d pcov.enabled=1 tests/vendor/bin/paratest \
      --passthru-php="'-d' 'pcov.enabled=1'" \
      --coverage-xml=tmp/coverage/coverage-xml --log-junit=tmp/coverage/junit.xml \
      --exclude-source-from-xml-coverage

- name: "Run infection"
  run: |
    git fetch --depth=1 origin $
    infection \
      --git-diff-base=origin/$ \
      --git-diff-lines \
      --coverage=tmp/coverage \
      --skip-initial-tests \
      --ignore-msi-with-no-mutations \
      --min-msi=100 \
      --min-covered-msi=100 \
      --log-verbosity=all \
      --debug \
      --logger-text=php://stdout

Note that we are using pcov rather than xdebug to record coverage information, as in our case this was the considerably faster option.

Also note that we are using paratest - which we already use for running tests in phpstan-src - to create coverage information with workers running in parallel. Before this change, when Infection itself triggered the initial test step, this work was done in a single process only.

This leads us to the following results:

  • the total amount of time required to run this dropped to ~12m 30s
  • coverage generation takes ~6m 10s
  • from looking at the paratest output, we see Generating code coverage report in PHPUnit XML format ... done [01:00.714]
  • running infection takes ~6m 20s

Speedup code coverage xml report generation

I was pretty surprised that the XML report generation alone takes 1 minute.

Looking into Blackfire profiles of this XML generation process yielded some interesting insights. While working on a few micro-optimizations in the underlying libraries I slowly started to better understand how all this works.

After a chat with php-src contributor Niels Dossche the idea came up that XML report generation could see a big speed boost after untangling the DOM and XMLWriter implementation. A new pull request which drops the DOM dependency shows we could reach a ~50% faster report generation. While the implementation before this PR was more flexible, I think this flexibility is not worth such a performance penalty. By removing the DOM interactions I feel we made the implementation more direct and explicit.
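To illustrate the difference between the two approaches, here is a minimal sketch that renders the same tiny document once through a DOM tree and once through a streaming XMLWriter. The function names, the `coverage`/`line` element names and the sample data are made up for this illustration and are not taken from php-code-coverage; the sketch only assumes the dom and xmlwriter extensions are loaded.

```php
<?php

declare(strict_types=1);

/**
 * DOM style: build the full tree in memory, serialize at the end.
 *
 * @param array<int, int> $lines line number => hit count (hypothetical data shape)
 */
function renderWithDom(array $lines): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->createElement('coverage');
    $doc->appendChild($root);

    foreach ($lines as $nr => $hits) {
        $line = $doc->createElement('line');
        $line->setAttribute('nr', (string) $nr);
        $line->setAttribute('hits', (string) $hits);
        $root->appendChild($line);
    }

    return (string) $doc->saveXML();
}

/**
 * XMLWriter style: emit elements directly, no intermediate node objects.
 *
 * @param array<int, int> $lines line number => hit count (hypothetical data shape)
 */
function renderWithXmlWriter(array $lines): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('coverage');

    foreach ($lines as $nr => $hits) {
        $writer->startElement('line');
        $writer->writeAttribute('nr', (string) $nr);
        $writer->writeAttribute('hits', (string) $hits);
        $writer->endElement();
    }

    $writer->endElement();
    $writer->endDocument();

    return $writer->outputMemory();
}

$lines = [1 => 3, 2 => 0];
$domXml = renderWithDom($lines);
$writerXml = renderWithXmlWriter($lines);
```

The streaming variant never allocates one object per element, which is the kind of overhead that can add up on large reports. How much it matters for a given report has to be measured.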

Faster code coverage data processing

Another idea which came up was looking into the data structures involved in PHPUnit’s sebastianbergmann/php-code-coverage component.

Reworking the implementation, which heavily relied on PHP arrays, led to ~33% faster data processing for PHPUnit’s --path-coverage option. Inspiration for this change came from a gist by Nikita Popov, which I found on github.com. It explains in full detail why/when objects use less memory than arrays.
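The array-versus-object effect can be observed with a small experiment. The following is a rough sketch, not code from php-code-coverage: the `LineCoverage` class, the `retainedBytes()` helper and the record shape are hypothetical, and the absolute numbers will differ between PHP versions.

```php
<?php

declare(strict_types=1);

// Hypothetical value object with three declared properties.
final class LineCoverage
{
    public function __construct(
        public readonly int $line,
        public readonly int $hits,
        public readonly bool $executable,
    ) {
    }
}

/**
 * Returns the additional memory in bytes held after building $count records.
 * The measurement is taken while all records are still referenced.
 */
function retainedBytes(callable $factory, int $count): int
{
    gc_collect_cycles();
    $before = memory_get_usage();

    $items = [];
    for ($i = 0; $i < $count; $i++) {
        $items[] = $factory($i);
    }

    $after = memory_get_usage();
    unset($items);

    return $after - $before;
}

$count = 50_000;

// Same data, once as associative arrays ...
$arrayBytes = retainedBytes(
    static fn (int $i): array => ['line' => $i, 'hits' => $i % 7, 'executable' => true],
    $count,
);

// ... and once as objects with declared properties.
$objectBytes = retainedBytes(
    static fn (int $i): LineCoverage => new LineCoverage($i, $i % 7, true),
    $count,
);

printf("arrays:  %d bytes\nobjects: %d bytes\n", $arrayBytes, $objectBytes);
```

Each small associative array carries its own hash table, while objects of the same class share the property layout and only store the values. This is why the object variant should report noticeably fewer bytes.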

While refactoring the implementation by introducing more immutable objects and reducing unnecessary duplicate work I squeezed out a bit more performance.

Prevent unnecessary work

Sebastian came up with the suggestion of removing the <source> element from the XML coverage report via an opt-in flag.

After playing with the idea it seems this information is not required by Infection, so he added a new --exclude-source-from-xml-coverage CLI option, which will be automatically enabled by Infection to speed up the coverage generation when PHPUnit 12.5+ is used.

A test on the PHPStan codebase shows that this can speed up the XML coverage report generation by ~15%.

As a follow-up, support for this new option was added to Infection and ParaTest.

Taking shortcuts

When working on slow processes like code-coverage recording, which takes multiple minutes to execute, it’s vital to take shortcuts which shorten the feedback loop. To assist myself I hacked a few lines of code into the process which serialized the generated CodeCoverage object and stored it as a 998MB file.
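Such a snapshot shortcut could look roughly like the sketch below. The helper names `dumpSnapshot()` and `loadSnapshot()` are hypothetical, and an `ArrayObject` stands in for the real CodeCoverage object, so this only demonstrates the serialize/unserialize round trip and not the actual hook into the coverage process.

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: writes a serialized copy of $data to $path and returns the byte count. */
function dumpSnapshot(object $data, string $path): int
{
    $bytes = file_put_contents($path, serialize($data));
    if ($bytes === false) {
        throw new RuntimeException("Could not write snapshot to {$path}");
    }

    return $bytes;
}

/** Hypothetical helper: restores an object previously written by dumpSnapshot(). */
function loadSnapshot(string $path): object
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Could not read snapshot from {$path}");
    }

    // Only unserialize files you created yourself.
    $data = unserialize($raw);
    if (!is_object($data)) {
        throw new RuntimeException("Snapshot at {$path} does not contain an object");
    }

    return $data;
}

// Stand-in for the expensive-to-record coverage object.
$coverage = new ArrayObject(['src/Foo.php' => [1 => 3, 2 => 0]]);

$path = sys_get_temp_dir() . '/coverage-data.ser';
dumpSnapshot($coverage, $path);
$restored = loadSnapshot($path);
```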

Using the pre-recorded data and the following short script made it possible to profile the XML report generation alone, without a long wait for the data recording:

<?php

require_once 'vendor/autoload.php';

use PHPUnit\Runner\Version;
use SebastianBergmann\CodeCoverage\Report\Xml\Facade as XmlReport;

$coverage = unserialize(file_get_contents(__DIR__ . '/coverage-data.ser'));
$config = file_get_contents(__DIR__ . '/coverage.xml');

$writer = new XmlReport(Version::id());
$writer->process($coverage, $config);

I put all this into a separate git repository to allow re-using it in the future.

Results

Applying all those fixes on the phpstan-src codebase yielded an impressive improvement in XML coverage report generation:

Before (PHPUnit 11.5.x)

Time: 04:37.104, Memory: 688.50 MB

Generating code coverage report in PHPUnit XML format … done [00:51.395]

After upgrading to PHPUnit 12.5.x

Time: 04:23.595, Memory: 678.50 MB

Generating code coverage report in PHPUnit XML format … done [00:21.631]

After adding --exclude-source-from-xml-coverage

Time: 04:16.807, Memory: 634.50 MB

Generating code coverage report in PHPUnit XML format … done [00:17.431]

Summary

Working through all these details and codebases was a lot of fun, while also taking a lot of my free time.

At this point I want to emphasize how important it is to separate the public API of a library/tool/component from the inner workings. Sebastian Bergmann and Arne Blankerts did a great job in the repositories I worked on in this context by declaring classes @internal, so we could easily make even backwards-incompatible changes, as long as the top-level public API is untouched.

In the future a lot of projects will benefit from these changes by updating PHPUnit and related libraries. Faster tooling processes will also save costly CI minutes and people’s waiting time.

Make sure your boss considers sponsoring my open source work, so I can spend more time on your beloved code quality tooling.

Markus Staab

New and noteworthy: PHPStan and PHPUnit integration
2025-11-15
https://staabm.github.io/2025/11/15/phpstan-validates-phpunit-data-provider

In this article we will have a brief look at the latest update to phpstan/phpstan-phpunit 2.0.8.

PHPStan validates PHPUnit data-providers

One of the features I am most proud of is the data-provider validation. It was requested by several people years ago, but until now we did not have a good idea how to make it happen without major changes in the PHPStan core.

Starting with this release, we take each data-set of a data-provider and check it against the signature of the corresponding test-case. That way we can validate whether a data-provider yields all necessary data to fulfill the test’s signature.

At the time of writing we support multiple kinds of data-providers:

  • @test
  • #[Test]
  • “test*” method name prefix
  • @dataProvider
  • #[DataProvider]
  • static data-provider
  • non-static data-provider
  • return [] data-providers
  • yield [] data-providers
  • yield from [] data-providers
  • named arguments in data-providers

See it in action:

#[DataProvider('aProvider')]
public function testTrim(string $expectedResult, string $input): void
{
}

public function aProvider(): array
{
   return [
      [
         'Hello World',
         " Hello World \n",
      ],
      [
         // Parameter #2 $input of method FooTest::testTrim() expects string, int given.
         'Hello World',
         123,
      ],
      [
         // Parameter #2 $input of method FooTest::testTrim() expects string, false given.
         'Hello World',
         false,
      ],
      [
         // Method FooTest::testTrim() invoked with 1 parameter, 2 required.
         'Hello World',
      ],
   ];
}

For this to happen we re-use existing rules for method call validation via the newly introduced NodeCallbackInvoker. This new interface allows us to create virtual made-up AST nodes, and handle them like regular method calls.

Related pull requests:

Ignore missingType.iterableValue for data-providers

You likely have been haunted by this error in your test-suite:

Method DataProviderIterableValueTest\Foo::dataProvider() return type has no value type specified in iterable type iterable.

Even in the phpstan-src codebase this error was ignored via NEON config in the past, as it was not really useful to repeat in every data-provider all the types which were already present in the test-case method signatures.

As you already saw in the above paragraph we learned how to validate data-providers with this release. We went one step further and re-used the existing validation logic to omit the missingType.iterableValue error only for those data-providers which we are able to validate. This is possible by implementing a new IgnoreErrorExtension.

Improved assertArrayHasKey() inference

Based on a fix in the PHPStan core, we are now able to properly narrow types after a call to PHPUnit’s assertArrayHasKey(). This will help to prevent false positive errors you may have experienced in the past.

PHPUnit version detector

With the addition of PHPUnitVersionDetector we will be able to easily implement rules or extensions tailored to certain PHPUnit versions.

This will be useful in the future, so we can for example assist in PHPUnit migrations and pave the way for a smoother upgrade path.

Performance improvements

People reading my blog or social media posts know my obsession with making things faster. This release is no different, as some changes have been made to make most PHPUnit-specific rules more efficient by reducing unnecessary work.

Easter eggs included

There is even more magic under the hood.

We have an experimental feature in PHPStan which allows us to not just report errors, but also --fix some of them. This new ability was also added to a few assert* rules.

For example, we can turn $this->assertSame(true, $this->returnBool()); into $this->assertTrue($this->returnBool());.

Summary

I spent a lot of time over a few weeks to make the PHPUnit integration shine. We are on a totally new level and even more cool new stuff is becoming possible.

Make sure your boss considers sponsoring my open source work, so I can spend more time on your beloved code quality tooling.

Markus Staab

Mutation testing with Infection in PHP
2025-08-01
https://staabm.github.io/2025/08/01/infection-php-mutation-testing

Recently I started playing with Infection, a mutation testing framework for PHP. Mutation testing is a technique to evaluate the quality of your tests.

This article describes my personal experience and ideas about mutation testing. I am intentionally not describing the theory behind it, but focus on my personal hands-on experience.

It was a natural choice for me to look into this topic, as it combines several concepts I am interested in:

  • Automated testing
  • Abstract syntax tree (AST) based analysis and manipulation
  • Static analysis
  • Type inference
  • Code quality tooling

After getting used to Infection and applying it to my own projects, I started to contribute to the project itself. As usual, I started by reporting issues with ideas for improvements and questions about my understanding of the approach. The maintainers were very responsive and helpful, which made it easy to get started. Within ~2 months I was able to contribute ~85 pull requests.

So let’s have a top level look at the tool.

What to expect from mutation testing?

From my point of view what you get from applying mutation testing is:

  • more precise metric for your test suite quality
  • a copilot for writing better tests
  • dead code detection

more precise metric for your test suite quality

After running Infection on your codebase, you will get a report with a mutation score indicator (MSI). It’s a metric which describes how likely it is that a new bug/regression introduced by you or your colleagues gets noticed by your quality tooling. The higher the MSI, the more likely it is that your tooling will catch the bug.
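As a rough sketch of how such a score is derived: to my understanding of the Infection documentation, the MSI is the share of defeated mutants (killed, timed out or errored) among all generated mutants. The function below is a simplified illustration of that formula with made-up numbers. It ignores additional mutant states Infection may track, and the function and parameter names are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Simplified mutation score indicator in percent.
 * Defeated mutants are those which were killed, timed out or errored.
 */
function mutationScoreIndicator(
    int $killed,
    int $timedOut,
    int $errored,
    int $escaped,
    int $notCovered,
): float {
    $total = $killed + $timedOut + $errored + $escaped + $notCovered;
    if ($total === 0) {
        return 0.0;
    }

    $defeated = $killed + $timedOut + $errored;

    return round($defeated * 100 / $total, 2);
}

// Made-up numbers: 83 of 100 mutants were detected by the test suite.
$msi = mutationScoreIndicator(killed: 80, timedOut: 2, errored: 1, escaped: 12, notCovered: 5);
```

In this example the 12 escaped and 5 uncovered mutants are the changes nobody would have noticed.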

In contrast, you might expect from your code line coverage report that any bug introduced in a covered line will be caught by your tests. However, this is not the case, as your tests might not be precise enough to catch the problem.

a copilot for writing better tests

Let’s have a look at a simple example:

if ($x > 0) {
    echo "hello world";
}

To cover this simple code with tests, you can make several mistakes:

  • you could assert only the positive case but forget to assert the negative case
    • does the implementation produce the correct output depending on the condition?
    • do your tests assert expectations when the condition is not met?
  • you could add tests which do not properly cover the boundary of the x > 0 expression, meaning off-by-one errors will not be detected
    • e.g. you need to verify whether it should be $x >= 0 or $x > 1 or $x < 0 etc.

Running Infection will give you examples (escaped mutants) as inspiration for what your test-suite does not cover properly. See the above example in the Infection playground in action.

For instance, the following escaped mutant tells you that your tests do not distinguish between the condition being $x > 0 or $x >= 0:

[Screenshot: Infection playground]

This does not necessarily mean that your implementation is wrong, but it tells you that your tests do not cover the condition properly. This might also be an indicator that your assertions need to be more precise, or that you need to add additional tests to cover the edge cases.
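The effect of a missing boundary test can be simulated without any tooling. In the sketch below, `greeting()` wraps the condition from the example, `greetingMutant()` is a hand-written stand-in for the `>` to `>=` mutation, and `passes()` plays the role of a tiny test runner. All three names are made up for this illustration.

```php
<?php

declare(strict_types=1);

// Original implementation, wrapping the condition from the example above.
function greeting(int $x): string
{
    if ($x > 0) {
        return 'hello world';
    }

    return '';
}

// Hand-written stand-in for a mutant: `>` was changed to `>=`.
function greetingMutant(int $x): string
{
    if ($x >= 0) {
        return 'hello world';
    }

    return '';
}

/**
 * Minimal "test suite": returns true when $impl yields the expected output for every input.
 *
 * @param array<int, string> $cases input => expected output
 */
function passes(callable $impl, array $cases): bool
{
    foreach ($cases as $input => $expected) {
        if ($impl($input) !== $expected) {
            return false;
        }
    }

    return true;
}

// Covers the positive and the negative case, but not the boundary.
$weakCases = [5 => 'hello world', -5 => ''];

// Additionally pins down the behavior right at the boundary.
$strictCases = $weakCases + [0 => '', 1 => 'hello world'];

$mutantEscapesWeakSuite = passes('greetingMutant', $weakCases);
$mutantKilledByStrictSuite = !passes('greetingMutant', $strictCases);
```

With only the weak cases both implementations pass, so the mutant escapes. Adding the case for `0` makes the mutant fail while the original still passes, which is what "killing" a mutant means.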

dead code detection

From a different perspective, looking at mutation testing results (escaped mutants) can help you to detect dead code. In case you are confident that your test suite covers all relevant cases, an escaped mutant tells you that certain code in your implementation doesn’t make a difference for the end result. This means that the code is dead and can be removed.

Image

See the above example in the Infection playground in action.
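A small, made-up illustration of this idea: the guard in `normalizeName()` below cannot change the result, because `strtolower('')` already returns an empty string. A mutant which drops the guard, written by hand here as `normalizeNameMutant()`, behaves identically for every input, so no test can kill it. Both function names are hypothetical.

```php
<?php

declare(strict_types=1);

function normalizeName(string $name): string
{
    $name = trim($name);

    // Redundant guard: strtolower('') returns '' anyway.
    if ($name === '') {
        return '';
    }

    return strtolower($name);
}

// Hand-written stand-in for the mutant with the guard removed.
function normalizeNameMutant(string $name): string
{
    return strtolower(trim($name));
}

$inputs = ['', '   ', 'Markus', '  PHPStan  ', "Infection\n"];

$differences = 0;
foreach ($inputs as $input) {
    if (normalizeName($input) !== normalizeNameMutant($input)) {
        $differences++;
    }
}
```

Since the two variants never differ, the escaped mutant points at code which can be deleted rather than at a missing test.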

Summary

Playing with Infection for 2 months was a lot of fun. It made me curious what else I can do to improve the tool and what value it can provide to the PHP community and my projects.

I think it is not that easy to start running Infection on an existing project, therefore I am planning to write another article about how to get started with it.

Markus Staab

PHPStan remembered types from constructor
2025-04-15
https://staabm.github.io/2025/04/15/phpstan-remember-constructor-types

Over the last few days I have been working on a new PHPStan capability, which allows PHPStan to use type information from analyzing a class constructor, with the goal of improving results when later analyzing instance methods or property hook bodies.

This feature will be available starting with PHPStan 2.1.12. In case you are curious you can play with it in the PHPStan Playground right now. The implementation itself can be inspected in pull request #3930.

Let’s have a look into a few example use-cases for this new feature:

Remember class_exists(), function_exists()

Checking class or function existence in the constructor, in combination with an aborting expression which prevents object creation, will prevent errors like Function some_unknown_function not found in instance methods. The same is true for class_exists.

This means you no longer need to wrap a call to a conditionally defined function in a function_exists block every time you use it. PHPStan will remember for the whole class that the function will exist, when analyzing instance methods or property hook bodies.

Static methods will still emit a Function some_unknown_function not found error, as these can still be called even if the constructor failed to create an object.

class User
{
   public function __construct() {
      if (!function_exists('some_unknown_function')) {
         throw new \LogicException();
      }
   }

   public function doFoo(): void
   {
      some_unknown_function();
   }
}

Remember global constants

Similar to the example above, it’s possible to check for the existence of a global constant in the constructor. What also comes in handy: a narrowed type of the global constant is preserved for the whole class as well.

class HelloWorld
{
   public function __construct()
   {
      if (!defined('REMEMBERED_FOO')) {
         throw new LogicException();
      }
      if (!is_string(REMEMBERED_FOO)) {
         throw new LogicException();
      }
   }

   static public function staticFoo2(): void
   {
      // error, static method types are not narrowed via constructor
      echo REMEMBERED_FOO;
   }

   public function returnFoo2(): int
   {
      // error, as the constant was narrowed to string
      return REMEMBERED_FOO;
   }
}

Remember class property types

With all the required machinery in place we went one step further and also remember type information about class properties.

readonly property types

When properties are declared readonly PHPStan is now able to remember all possible types assigned in the constructor. You no longer need to declare a narrow phpdoc type in this case to make PHPStan aware of the concrete values.

class User
{
   public string $name {
      get {
         // previously we only knew $this->type is `int`.
         // new: we know the type of $this->type is `1|2`
         return $this->name . $this->type;
      }
   }

   private readonly int $type;

   public function __construct(string $name) {
      $this->name = $name;
      if (rand(0,1)) {
         $this->type = 1;
      } else {
         $this->type = 2;
      }
   }
}

typed properties and the uninitialized state

Until now PHPStan didn’t know which properties have been initialized in the constructor. Thanks to the recent additions the analysis of instance methods is now aware which properties can no longer be in the uninitialized state, because they have been initialized already.

With this knowledge we are able to tell whether isset(), empty() or ?? is redundant.

class User
{
   private string $string;

   public function __construct()
   {
      if (rand(0, 1)) {
          $this->string = 'world';
      } else {
          $this->string = 'world 2';
      }
   }

   public function doFoo(): void
   {
      // Property User::$string in isset() is not nullable nor uninitialized.
      if (isset($this->string)) {
         echo $this->string;
      }
   }

   public function doBar(): void
   {
      // Property User::$string on left side of ?? is not nullable nor uninitialized.
      echo $this->string ?? 'default';
   }
}

Related PHPStan issues:


Do you like PHPStan and use it every day? Consider sponsoring my open source work.

Markus Staab

Thank You
2025-01-24
https://staabm.github.io/2025/01/24/thank-you

Hey supporter :).

First let me say thank you. It means a lot to me that people like you see value in my open source work.

Behind every bullet point with my name on it in release notes of PHPStan, PHPUnit, Rector is at least one person with a problem. I wouldn’t be able to dive into all the necessary details to figure them out and contribute improvements, without your support.

Over the course of 2024 I was able to fix ~69 reported problems while contributing ~235 pull requests to PHPStan alone. This work is necessary to make the end-user experience as frictionless as possible. PHPStan gets a better understanding of your code, produces fewer false positives and is able to point out problems it wouldn’t recognize otherwise. When not working on features I keep an eye on performance bottlenecks to make the process more efficient, so you waste less time waiting.

I can share similar stories when working on PHPUnit or Rector. Ironing out problems is the magic sauce which keeps the PHP open source ecosystem alive and useful.

If you got curious, please have a look at my 2024 contribution summary.

Besides all the implementation work, I also had the pleasure of meeting the people behind these projects in person in 2024 for the very first time. Meeting in person with people you have worked with only remotely for multiple years is always exciting. Thanks to your financial support I was able to attend the PHP Days in Dresden to meet Ondřej Mirtes (PHPStan inventor), and later on I had a PHPUnit Codesprint with Sebastian Bergmann (PHPUnit inventor) in Munich.

Coming together and spending some time together also helped to improve our teamwork for future collaborations.

For 2025 there is still a lot of stuff to do and I don’t plan to stop contributing to the mentioned projects. I have plans to join a few PHP conferences/meetups and would love to see you there. Have a look at my blog to get the latest news about my open source work and please spread the word.

If you need a helping hand with one of those tools in your projects, feel free to get in contact. In addition, I am still looking for more sponsors - especially from companies which use the above-mentioned tools. I would love to reduce my primary job by a few hours a week and push the PHP open source ecosystem to the next level.

Thanks again!

Markus

Markus Staab

Contribution Summary 2024
2024-12-11
https://staabm.github.io/2024/12/11/contribution-summary-2024

2024 marks the year in which Tomas Votruba (Rector) and Ondřej Mirtes (PHPStan) started sponsoring my open source work, which means a lot to me 🥰.

Another outstanding moment this year was the addition of my staabm/side-effects-detector library in PHPUnit.

Let’s have a look at my contribution highlights of 547 merged pull requests across 80 open-source projects:

In a recent blog post you can in addition find my current plan for the upcoming months. I am still looking for financial support for this effort.

The following table shows the distribution of free-time contributions across the different projects I am working on.

project | merged pull requests | addressed issues
phpstan/phpstan* | ~234 (~116 in 2023) | 69 (33 in 2023)
rector/rector* | ~31 (~178 in 2023) | 2
staabm/phpstan-dba | 38 (~44 in 2023) | 8
sebastianbergmann/phpunit | 37 | -
staabm/phpstan-todo-by | 35 (~33 in 2023) | 7
TomasVotruba/unused-public | 28 | 1
staabm/phpstan-baseline-analysis | 22 | -
Roave/BetterReflection | 12 | -
composer/* | 12 | -
FriendsOfREDAXO/rexstan | 11 | -
easy-coding-standard/easy-coding-standard | 9 | 1
php/doc-en | 7 | -
thecodingmachine/safe | 8 | -
sebastianbergmann/exporter | 5 | -
nikic/PHP-Parser | 5 | -
PHP-CS-Fixer/PHP-CS-Fixer | 4 | -
easy-coding-standard/easy-coding-standard | 4 | -
symfony/symfony | 3 | -
TomasVotruba/class-leak | 3 | -
TomasVotruba/type-coverage | 2 | -
infection/infection | 2 | -
shipmonk-rnd/dead-code-detector | 2 | -
TomasVotruba/cognitive-complexity | 1 | -
vimeo/psalm | 1 | 1
symfony/symfony | 1 | -
larastan/larastan | 1 | -
… a lot more | - | -

numbers crunched with staabm/oss-contribs

I am pretty happy with contributing to so many popular and important projects of the PHP ecosystem. If you look closely you can easily see my focus on quality assurance and static analysis tools :-).

Thank you

I want to say thank you to everyone supporting my efforts. Whether you sponsor my time or you invest your own time to review my pull requests or maintain one of the above projects.

I couldn’t deliver so much value in the PHP ecosystem without you 🥰.

git-wrapped.com stats for @staabm for 2024

2025 is just around the corner

I wish you all the best for the upcoming year. I am looking forward to continuing my open source work and I hope you will support me in doing so.

If one of those open source projects is critical for your business, please consider supporting my work with your sponsoring 💕

Markus Staab

My new PHPStan focus: multi-phpversion support
2024-11-28
https://staabm.github.io/2024/11/28/phpstan-php-version-in-scope

TL;DR: What’s already done?

In a recent article I summarized the problems and results of my work on mixed types in PHPStan. Now we will have a look at what comes next.

For a few years now, I have been contributing to PHPStan with a focus on improving the type inference, which means I am looking into code where mixed types are involved and how I can improve the situation.

In my opinion we are in pretty good shape regarding mixed types, as the most common problems I can think of seem to be addressed. For sure new examples will show up, and we will have to continue to improve the situation. I am no longer prioritizing mixed problems over other things in my PHPStan work, though.

Problem space

So what’s ahead? My new focus area will be improving the PHPStan story around multi-phpversion supporting code. This means focusing on stuff which is different between PHP versions and tasks/hurdles common to projects which are in the process of a PHP version upgrade.

If you want to cover your codebase across several PHP versions, you need to set up a CI matrix with different PHP versions. You also need multiple PHPStan baselines to ignore errors which are only relevant for a specific PHP version. Such a setup brings additional complexity not everyone is willing to deal with.
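To give an idea of what such a setup involves, here is a sketch of how a baseline could be selected per PHP version. It assumes a PHP-based configuration file which is listed in the includes of your phpstan.neon, an approach I believe the PHPStan documentation describes. The function name `buildPhpStanConfig()` and the baseline file names are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Builds a PHPStan configuration array which pulls in a version-specific baseline.
 * The baseline file names are made up for this example.
 *
 * @return array{includes: list<string>, parameters: array{phpVersion: int}}
 */
function buildPhpStanConfig(int $phpVersionId, string $baseDir): array
{
    $baseline = $phpVersionId >= 80000
        ? 'phpstan-baseline-php8.neon'
        : 'phpstan-baseline-php7.neon';

    return [
        'includes' => [$baseDir . '/' . $baseline],
        'parameters' => ['phpVersion' => $phpVersionId],
    ];
}

$php7Config = buildPhpStanConfig(70400, '/app');
$php8Config = buildPhpStanConfig(80300, '/app');
```

Inside the actual configuration file the last step would be `return buildPhpStanConfig(PHP_VERSION_ID, __DIR__);`, so each job of the CI matrix picks the baseline matching its runtime.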

In my experience most projects set up PHPStan only for a single PHP version and ignore the rest, which leaves a lot of potential errors undetected.

Another challenge you face over and over when upgrading PHP versions is the migration from resources to objects. There are articles on the web on this problem alone. Different PHP versions use different types for some APIs - e.g. curl_init or socket_create, to name a few - and as soon as you are planning a PHP upgrade you usually need to support both signatures - resource and the corresponding object types - in tandem for a while, so you can run your application on your current and your future production system at the same time.
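Supporting both signatures in tandem typically ends up in small guard functions like the following sketch. The function name `isSocketHandle()` is hypothetical, and the resource type name `Socket` used for the PHP 7 branch is my assumption about what get_resource_type() reports for ext-sockets resources.

```php
<?php

declare(strict_types=1);

/**
 * Accepts the return value of socket_create() on both old and new PHP versions.
 *
 * @param mixed $handle untyped on purpose, so the signature also parses on PHP 7
 */
function isSocketHandle($handle): bool
{
    // PHP 8.0+: socket_create() returns a \Socket object.
    // instanceof is safe here even when the class does not exist.
    if ($handle instanceof \Socket) {
        return true;
    }

    // PHP 7.x: socket_create() returns a resource (type name assumed to be "Socket").
    return is_resource($handle) && get_resource_type($handle) === 'Socket';
}

// Values which are not socket handles on any PHP version.
$stream = fopen('php://memory', 'r');
$results = [
    'string' => isSocketHandle('not a socket'),
    'false' => isSocketHandle(false),
    'stream' => isSocketHandle($stream),
];
```

Code like this has to stay in place until the minimum supported PHP version is raised, and at that point the resource branch becomes dead code which tooling should ideally point out.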

The topic gets even more complicated in case you are building a tool, library or a framework, as you usually need to support multiple PHP versions for a longer time. You also need to handle phasing out old PHP versions and adding support for new ones in your compatibility matrix over and over, which means you constantly need to answer questions like:

  • which code is going to be dead because of a min-php version raise?
  • which code needs adjustments to support a new PHP version?
  • how can we make sure that code which gets adapted for the new PHP version still works on the old PHP version?
  • do we have a rough idea how many problems we need to solve?

To help you answer these questions my goals are:

  • Projects which can only afford a single PHPStan CI job should detect as many cross-php version related errors as possible
  • Running PHPStan on multiple PHP versions should be as frictionless as possible

How can you support my effort?

I think working on this will be a multi-month free-time effort and will take at least several dozen pull requests.

If you are hit by at least one of the problems I described above and feel the pain, you should talk to your boss about sponsoring my free-time efforts, so I can spend more time on it, and you have fewer problems to deal with in your daily job.

Your task to upgrade your employer’s codebases to PHP 8.4 may already be in the pipeline :-).

What’s next?

The current plan is to make PHPStan aware of a narrowed PHP-Version within the current scope and utilize this information in type inference and error reporting. This means while analyzing code we no longer just use a fixed PHP version configured in e.g. PHPStan NEON configuration, but also narrow it further down based on the code at hand. Nearly all rules in the PHPStan core and 1st party extensions need to be adjusted.

Let me give you a few examples which currently don’t work well, but should work much better after the project evolves:

At the time of writing PHPStan 2.0.2 will report null coalescing errors in your code only if you narrow down the PHP version by configuration. This means you define the PHP version or version-range by NEON config, composer.json (as of PHPStan 2+) or implicitly by the PHP runtime version you are using for PHPStan.

Running the example below on PHP 8 without additional configuration will not yield any errors. As of now you would need e.g. a separate CI job configured for PHP 7.3 or lower to catch the error. In the future, I want PHPStan to catch this error even when running on PHP 8 or later, without any additional configuration:


if (PHP_VERSION_ID < 70400) {
    // should error about null coalescing assign operator,
    // which requires PHP 7.4+
    $y['y'] ??= [];
} else {
    $y['x'] ??= [];
}

Another example: PHPStan is using a single knowledge base for return and parameter types of functions and methods. This information is narrowed down by PHPStan Extensions when e.g. parameter values are known at static analysis time. In the future I want to improve the type inference e.g. for cases where PHP used resource types in the past, but uses class/object types in more modern versions:


class MySocket
{
  public function create(): ?Socket
  {
    if (PHP_VERSION_ID < 80000) {
        throw new RuntimeException('PHP 8.0 required');
    }

    // can only return `\Socket|false` but PHPStan sometimes
    // mixes it up with PHP7 `resource` type
    return socket_create(AF_INET, SOCK_DGRAM, SOL_UDP) ?: null;
  }
}

There are a lot of other problem areas, for which you see the errors only when PHPStan is configured with certain PHP versions:

  • named arguments
  • parameter contravariance
  • return type covariance
  • non-capturing exception catches
  • native union types
  • several deprecated features around how php-src handles parameters
  • class constants
  • legacy constructors
  • parameter type widening
  • unset cast
  • multibyte string handling functions
  • readonly properties
  • readonly classes
  • enums
  • intersection types
  • tentative return types
  • array unpacking
  • dynamic properties
  • constants in traits
  • php native attributes
  • implicit parameter nullability

What you just read about is the result of my initial research. I am pretty sure we will shape new ideas after iterating on the problems involved and the solutions we come up with.

I will work through the mentioned problem areas one after another, which also means your developer experience when using PHP version specific language features with PHPStan should improve over time, release after release.

Do these problems sound relevant to you? Please spread the word about my free time project and retoot on Mastodon or retweet on Twitter/X.

What’s already done?

This chapter will be updated to reflect the ongoing progress.

Narrow types by PHP_VERSION_ID

The first step in this direction has already been achieved: since PHPStan 2.0, PHPStan is aware of the PHP version requirements defined in composer.json and takes this knowledge into account to narrow constants like PHP_VERSION_ID et al.

There is a dedicated blog post about this topic already: PHPStan PHP Version Narrowing
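
To make the effect concrete, here is a minimal sketch. It assumes the analysed project's composer.json requires "php": "^8.1", and the function name describeRuntime() is made up for illustration. Under that assumption PHPStan 2.0+ should treat PHP_VERSION_ID as an integer range starting at 80100 rather than a plain int, so the first branch becomes detectable as dead code:

```php
<?php declare(strict_types = 1);

// Assumption: composer.json of the analysed project contains
//   "require": { "php": "^8.1" }

// Hypothetical helper, only here to have a version check to look at.
function describeRuntime(): string
{
    if (PHP_VERSION_ID < 80100) {
        // With the assumed constraint this comparison can never be true,
        // which PHPStan should be able to point out.
        return 'legacy';
    }

    return 'modern';
}

echo describeRuntime(), "\n";
```

At runtime the function simply follows the PHP version in use. The interesting part is static: without a version constraint, PHPStan has to consider both branches reachable.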

Report deprecations in ini_*() functions

At the time of writing there are ~20 deprecated php.ini options. A new PHPStan rule was implemented which reports usages of ini_*() functions which use a deprecated option:

<?php declare(strict_types = 1);

// new error:
// Call to function ini_get() with deprecated option 'assert.active'.
var_dump(ini_get('assert.active'));

PHPStan playground snippet

Scope narrowed PHP Version

Starting with PHPStan 2.0.3, the PHP version is narrowed down within the current scope, and built-in rules have been adjusted to report errors in more cases.

For more details see the above background story.

Markus Staab
A mixed type PHPStan journey
2024-11-26T00:00:00+00:00
https://staabm.github.io/2024/11/26/phpstan-mixed-types

A mixed type represents the absence of type information. It is a union of all types, which means it can be anything. This in turn leads to suboptimal PHPStan analysis results, which can mean missed errors or even false positives.

For a few years now, I have been contributing to PHPStan with a focus on improving type inference, which means looking into code where mixed types are involved and figuring out how the situation can be improved.

I will start to focus on a different PHPStan area soon, so I thought it would be a good time to summarize the achievements made.

In this article I want to share the most meaningful contributions to PHPStan core, but also look at PHPStan extensions work which was helpful along the way.

Narrow types from if-conditions

What most of the situations we are looking at have in common is that we have very little information at the beginning. One useful tool to handle that is a subtractable type, which PHPStan has been using for mixed for a long time already. This means we don’t describe what we know about a type, but instead narrow it down by what we know it is not:


function doFoo($mixed) {
  if ($mixed) {
    // $mixed can be anything but a falsey type
    \PHPStan\dumpType($mixed); // mixed~(0|0.0|''|'0'|array{}|false|null)
  }

  if (!$mixed) {
    // $mixed can be anything but a truthy type
    \PHPStan\dumpType($mixed); // 0|0.0|''|'0'|array{}|false|null
  }
}

Subtractable type narrowing is already used by PHPStan for strict comparisons (e.g. === or !==) and for some very specific but frequently used code patterns. I looked at cases where it was still missing, like type casts in conditions:


class Test {

  private ?string $param;

  function show() : void {
    if ((int) $this->param) {
      \PHPStan\dumpType($this->param); // string
    } elseif ($this->param) {
      \PHPStan\dumpType($this->param); // non-falsy-string
    }
  }

  function show2() : void {
    if ((float) $this->param) {
      \PHPStan\dumpType($this->param); // string|null
    } elseif ($this->param) {
      \PHPStan\dumpType($this->param); // non-falsy-string
    }
  }

  function show3() : void {
    if ((bool) $this->param) {
      \PHPStan\dumpType($this->param); // non-falsy-string
    } elseif ($this->param) { // Elseif condition is always false.
      \PHPStan\dumpType($this->param); // *NEVER*
    }
  }

  function show4() : void {
    if ((string) $this->param) {
      \PHPStan\dumpType($this->param); // non-empty-string
    } elseif ($this->param) { // Elseif condition is always false.
      \PHPStan\dumpType($this->param); // *NEVER*
    }
  }
}

Next I was looking into how a cast on already subtracted mixed types influences the results:


/**
 * @param int|0.0|''|'0'|array{}|false|null $moreThenFalsy
 */
function subtract(mixed $m, $moreThenFalsy) {
  if ($m !== true) {
    assertType("mixed~true", $m);
    assertType('bool', (bool) $m); // mixed could still contain something truthy
  }
  if ($m !== false) {
    assertType("mixed~false", $m);
    assertType('bool', (bool) $m); // mixed could still contain something falsy
  }
  if (!is_bool($m)) {
    assertType('mixed~bool', $m);
    assertType('bool', (bool) $m);
  }
  if (!is_array($m)) {
    assertType('mixed~array<mixed, mixed>', $m);
    assertType('bool', (bool) $m);
  }

  if ($m) {
    assertType("mixed~(0|0.0|''|'0'|array{}|false|null)", $m);
    assertType('true', (bool) $m);
  }
  if (!$m) {
    assertType("0|0.0|''|'0'|array{}|false|null", $m);
    assertType('false', (bool) $m);
  }
  if (!$m) {
    if (!is_int($m)) {
      assertType("0.0|''|'0'|array{}|false|null", $m);
      assertType('false', (bool)$m);
    }
    if (!is_bool($m)) {
      assertType("0|0.0|''|'0'|array{}|null", $m);
      assertType('false', (bool)$m);
    }
  }

  if (!$m || is_int($m)) {
    assertType("0.0|''|'0'|array{}|int|false|null", $m);
    assertType('bool', (bool) $m);
  }

  if ($m !== $moreThenFalsy) {
    assertType('mixed', $m);
    assertType('bool', (bool) $m); // could be true
  }

  if ($m != 0 && !is_array($m) && $m != null && !is_object($m)) { // subtract more types than falsy
    assertType("mixed~(0|0.0|''|'0'|array<mixed, mixed>|object|false|null)", $m);
    assertType('true', (bool) $m);
  }
}

A case where we did not properly narrow types was in comparisons with strlen() and integer-range types:


function narrowString(string $s) {
  $i = random_int(1, 5); // PHPStan infers int<1, 5>
  if (strlen($s) == $i) {
    \PHPStan\dumpType($s); // non-empty-string
  }

  $i = random_int(2, 5); // PHPStan infers int<2, 5>
  if (strlen($s) == $i) {
    \PHPStan\dumpType($s); // non-falsy-string
  }
}

This also works in a similar fashion when comparing the results of substr():


/**
 * @param non-empty-string $nonES
 * @param non-falsy-string $falsyString
 */
public function stringTypes(string $s, $nonES, $falsyString): void
{
  if (substr($s, 10) === $nonES) {
    assertType('non-empty-string', $s);
  }

  if (substr($s, 10) === $falsyString) {
    assertType('non-falsy-string', $s);
  }
}

A pretty complex area was figuring out what isset($array[$key]) means for the type of $key:


/**
 * @param array<int, string> $intKeyedArr
 * @param array<string, string> $stringKeyedArr
 */
function narrowKey($mixed, string $s, int $i, array $generalArr, array $intKeyedArr, array $stringKeyedArr): void {
  if (isset($generalArr[$mixed])) {
    assertType('mixed~(array|object|resource)', $mixed);
  } else {
    assertType('mixed', $mixed);
  }
  assertType('mixed', $mixed);

  if (isset($generalArr[$i])) {
    assertType('int', $i);
  } else {
    assertType('int', $i);
  }
  assertType('int', $i);

  if (isset($intKeyedArr[$mixed])) {
    assertType('mixed~(array|object|resource)', $mixed);
  } else {
    assertType('mixed', $mixed);
  }
  assertType('mixed', $mixed);

  if (isset($intKeyedArr[$s])) {
    assertType("lowercase-string&numeric-string&uppercase-string", $s);
  } else {
    assertType('string', $s);
  }
  assertType('string', $s);

  if (isset($stringKeyedArr[$mixed])) {
    assertType('mixed~(array|object|resource)', $mixed);
  } else {
    assertType('mixed', $mixed);
  }
  assertType('mixed', $mixed);
}

function emptyString($mixed)
{
  // see https://3v4l.org/XHZdr
  $arr = ['' => 1, 'a' => 2];
  if (isset($arr[$mixed])) {
    assertType("''|'a'|null", $mixed);
  } else {
    assertType('mixed', $mixed); // could be mixed~(''|'a'|null)
  }
  assertType('mixed', $mixed);
}

function numericString($mixed, int $i, string $s)
{
  $arr = ['1' => 1, '2' => 2];
  if (isset($arr[$mixed])) {
    assertType("1|2|'1'|'2'|float|true", $mixed);
  } else {
    assertType('mixed', $mixed);
  }
  assertType('mixed', $mixed);
}

function arrayAccess(\ArrayAccess $arr, $mixed) {
  if (isset($arr[$mixed])) {
    assertType("mixed", $mixed);
  } else {
    assertType('mixed', $mixed);
  }
  assertType('mixed', $mixed);
}

Immediately-invoked function expression (IIFE)

A pattern known from JavaScript projects that sometimes also pops up in PHP code is the immediately-invoked function expression. Type inference improvements for this pattern in particular were implemented to support TwigStan:


/** @param array{date: DateTime} $c */
function main(mixed $c): void{
  assertType('array{date: DateTime}', $c);
  $c['id']=1;
  assertType('array{date: DateTime, id: 1}', $c);

  $x = (function() use (&$c) {
    assertType("array{date: DateTime, id: 1}", $c);
    $c['name'] = 'ruud';
    assertType("array{date: DateTime, id: 1, name: 'ruud'}", $c);
    return 'x';
  })();

  assertType("array{date: DateTime, id: 1, name: 'ruud'}", $c);
}


/** @param array{date: DateTime} $c */
function main2(mixed $c): void{
  assertType('array{date: DateTime}', $c);
  $c['id']=1;
  $c['name'] = 'staabm';
  assertType("array{date: DateTime, id: 1, name: 'staabm'}", $c);

  $x = (function() use (&$c) {
    assertType("array{date: DateTime, id: 1, name: 'staabm'}", $c);
    $c['name'] = 'ruud';
    assertType("array{date: DateTime, id: 1, name: 'ruud'}", $c);
    return 'x';
  })();

  assertType("array{date: DateTime, id: 1, name: 'ruud'}", $c);
}

New PHPStan doc-types

To express types better, a few phpdoc improvements have been implemented.

What's great about new phpdoc types is that we can utilize them in stubs shipped with PHPStan releases, but they can also be used in any userland PHP codebase to help improve static analysis results.

If PHPStan had to infer all this information from the source, it would be a lot slower than it is right now.

By adding doc-types you also add information and semantics to the code and state your intent. This is helpful not only for the tooling but also for developers reading your implementation.

That way PHPStan can tell you whether your expectations are right or whether you are lying :-).

New PHPStan Extension types

Since PHPStan 1.11.6, ParameterOutTypeExtensions allow by-reference parameters of functions/methods to be narrowed programmatically and context-sensitively. This was later used to improve by-reference parameter type inference after calls to parse_str and preg_match*.
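
A small sketch of why this matters for parse_str(): the variable passed as the second argument is created by the call itself, so everything a static analyser knows about it has to come from the call. The query string is an arbitrary example, and I deliberately don't quote the exact type PHPStan reports for it:

```php
<?php declare(strict_types = 1);

// $result is undefined before the call; parse_str() fills it by reference
parse_str('page=2&tags[]=php&tags[]=phpstan', $result);

// scalar values arrive as strings, bracket syntax creates nested arrays
var_dump($result['page']); // string(1) "2"
var_dump($result['tags']); // array containing "php" and "phpstan"
```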

Utilizing information outside the PHP Source

A different take was used to improve type inference of the $matches by-ref parameter of preg_match() based on a REGEX abstract syntax tree. It’s a complex story on its own with a dedicated array shape match inference article.
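
As a rough illustration of what can be read out of a pattern, consider the sketch below. The function parseMajorMinor() and its pattern are made up for this example. Both capturing groups are non-optional, so whenever preg_match() returns 1 the offsets 1 and 2 are guaranteed to exist, and that is the kind of knowledge a regex-aware analysis can derive:

```php
<?php declare(strict_types = 1);

/**
 * Hypothetical helper: splits "major.minor" into two integers.
 *
 * @return array{int, int}|null
 */
function parseMajorMinor(string $version): ?array
{
    if (preg_match('/^(\d+)\.(\d+)$/', $version, $matches) === 1) {
        // both groups always participate in a successful match,
        // so no isset() checks are needed here
        return [(int) $matches[1], (int) $matches[2]];
    }

    return null;
}

var_dump(parseMajorMinor('8.4'));   // [8, 4]
var_dump(parseMajorMinor('stable')); // NULL
```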

Similar improvements landed for the printf() family of functions, see PHPStan sprintf/sscanf type inference.
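
The same idea applies to format strings: a constant format tells you a lot about the result. A minimal runtime sketch with arbitrary example values, again without claiming the exact types PHPStan prints:

```php
<?php declare(strict_types = 1);

// %03d: integer, zero-padded to a width of 3 characters
$padded = sprintf('%03d', 7);

// without extra arguments sscanf() returns one array element per conversion
$parsed = sscanf('12 apples', '%d %s');

var_dump($padded); // string(3) "007"
var_dump($parsed); // [12, "apples"]
```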

Last but not least a PHPStan extension named phpstan-dba was created which introspects the database schema to implement type inference for the database access layer via SQL abstract syntax tree. This is covered by a series of blog posts.

After one focus area is before the next

This article highlighted only a few of the many contributions I made to PHPStan over the last few years.

A big thank-you goes out to all my sponsors and supporters, who make it possible for me to work on PHPStan and other open-source projects.

While closing this type inference focus chapter, I am looking forward to the next challenges. What comes up next will be the topic of a future blog post.

stay tuned ⚡️

Markus Staab
PHPStan performance on different hardware
2024-11-17T00:00:00+00:00
https://staabm.github.io/2024/11/17/phpstan-performance-on-different-hardware

I recently upgraded my MacBook to the newly released model. The idea was to get a faster feedback loop when working on complex stuff like PHPStan core itself.

How much impact does hardware have on PHPStan performance?

Let’s find out what we can expect when running on different hardware.

For the sake of this article we compare runs of PHPStan@cc4eb92. We run time make in the terminal from the project root 5 times, after a fresh boot-up and without any other applications running. These numbers are not scientific, but they give us a rough idea.
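
If you want to reproduce such a measurement without copying numbers by hand, a small script can do the repetition. This is only a sketch and not what I used for the numbers in this article; timeCommand() is a made-up helper name, and it measures wall-clock time only:

```php
<?php declare(strict_types = 1);

/**
 * Hypothetical helper: runs a shell command several times and
 * returns the wall-clock duration of each run in seconds.
 *
 * @return list<float>
 */
function timeCommand(string $command, int $runs = 5): array
{
    $durations = [];

    for ($i = 0; $i < $runs; $i++) {
        $output = []; // exec() appends to this array, so reset it per run
        $start = hrtime(true); // monotonic clock, in nanoseconds

        exec($command, $output, $exitCode);
        if ($exitCode !== 0) {
            throw new RuntimeException("Command failed with exit code $exitCode");
        }

        $durations[] = (hrtime(true) - $start) / 1e9;
    }

    return $durations;
}

// Usage, from the project root:
// $d = timeCommand('make');
// printf("%.0f-%.0f seconds\n", min($d), max($d));
```

Reporting the fastest and slowest run matches the ranges shown in this article.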

Running macOS

no opcache
$ php -v
PHP 8.3.13 (cli) (built: Oct 22 2024 18:39:14) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.3.13, Copyright (c) Zend Technologies

Apple MacBook M1 Pro (2021), 10‑Core CPU, 1 TB SSD, 32 GB RAM (on full battery)

  • 77-85 seconds

Apple MacBook M4 Pro (2024), 14‑Core CPU, 1 TB SSD, 48 GB RAM (on full battery)

  • 57-59 seconds

-> In my experience the performance of “on battery” vs. “plugged in” is not that different on a MacBook.

with opcache
$ php -v
PHP 8.3.13 (cli) (built: Oct 22 2024 18:39:14) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.3.13, Copyright (c) Zend Technologies
    with Zend OPcache v8.3.13, Copyright (c), by Zend Technologies

MacBook Air M1 (2020), 8‑Core CPU, 16 GB RAM

  • 128-140 seconds

Apple MacBook M1 Max, 10‑Core CPU, 32 GB RAM

  • 61-62 seconds

Apple MacBook M2 Pro (2023), 12‑Core CPU, 1 TB SSD, 16 GB RAM (plugged in)

  • 76-82 seconds

Apple MacBook M2 Pro (2023), 12‑Core CPU, 1 TB SSD, 16 GB RAM (on full battery)

  • 75-85 seconds

Apple MacBook M4 Max (2024), 16‑Core CPU, 1 TB SSD, 64 GB RAM (plugged in)

  • 59-62 seconds

-> opcache on the CLI does not affect the app, as long as you don’t use file-based caching.

Running Windows 11 x64 23H2

$ php -v
PHP 8.2.12 (cli) (built: Oct 24 2023 21:15:35) (NTS Visual C++ 2019 x64)
Copyright (c) The PHP Group
Zend Engine v4.2.12, Copyright (c) Zend Technologie

Lenovo Thinkpad P1 Gen 5, Intel Core i9-12900H, 1 TB SSD, 32 GB RAM (on full battery), Microsoft Defender XDR hardened

  • 115-120 seconds

Lenovo Thinkpad P1 Gen 5, Intel Core i9-12900H, 1 TB SSD, 32 GB RAM (plugged in), Microsoft Defender XDR hardened

  • 110-120 seconds

-> In my experience the performance of “on battery” vs. “plugged in” is only marginally different on a Thinkpad.

Closer look at Apple MacBook M4 Pro (2024), 14‑Core CPU, 1 TB SSD, 48 GB RAM

$ php -v
PHP 8.3.13 (cli) (built: Oct 22 2024 18:39:14) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.3.13, Copyright (c) Zend Technologies

Still on PHPStan@cc4eb92 but looking at the separate make targets:

time make tests

  • 37-45 seconds

time make phpstan

  • 18-19 seconds

M4 Pro (2024), 14‑Core CPU, 1 TB SSD, 48 GB RAM with Microsoft Defender enabled

time make tests

  • 41-49 seconds

time make phpstan

  • 19-21 seconds
Markus Staab