Purpose, syntax and examples of generators in PHP (yield)

Lecture



Purpose and syntax of a generator in php

The yield keyword returns a value from a function while preserving the function's current execution state.
When the calling loop requests the next value, the function is not restarted: it resumes from the saved state, continuing right after the point where it yielded.

This is loosely analogous to a database cursor, where processing continues from the current cursor position.

A generator looks like a regular function, except that instead of returning a single value it can yield as many values as necessary. Any function that contains yield is a generator function.

When a generator function is called, it returns an object that can be iterated over. When you iterate over this object (for example, in a foreach loop), PHP calls the object's iteration methods each time a new value is needed; the generator runs until it yields that value, then its state is saved until the next value is requested.
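
A minimal sketch of such a generator function, iterated with foreach, might look like the following. The function name oneToThree is only an illustration and does not come from the original lecture.

```php
<?php
// Hypothetical generator function: yields the integers 1, 2, 3 one at a time.
function oneToThree(): Generator
{
    for ($i = 1; $i <= 3; $i++) {
        yield $i; // pause here and hand $i to the caller
    }
}

// Calling the function runs none of its body yet; it only creates a Generator object.
$generator = oneToThree();

// Each loop step resumes the function until its next yield.
foreach ($generator as $value) {
    echo $value, ' ';
}
```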


The result of this example:

1 2 3 

When the generator has no more values to yield, it simply finishes, and the calling code continues as if an array had run out of elements to iterate over.

The whole point of a generator is the yield keyword. In its simplest form, the yield statement can be thought of as a return statement, except that instead of terminating the function, yield only pauses its execution and hands back the current value. The next time a value is requested, the function resumes from the point where it was paused.
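
To make the pause-and-resume behaviour visible, the sketch below drives a generator by hand with the Generator methods current(), next() and valid() instead of foreach. The function steps() and the $log array are assumptions made for this illustration; the log records which parts of the function body have actually run.

```php
<?php
// Hypothetical generator that records its own progress in $log.
function steps(array &$log): Generator
{
    $log[] = 'started';
    yield 'first';
    $log[] = 'resumed after first';
    yield 'second';
    $log[] = 'finished';
}

$log = [];
$gen = steps($log);        // no code from steps() has run yet
$before = $log;            // still an empty array

$first = $gen->current();  // runs the body up to the first yield
$gen->next();              // resumes after the first yield, stops at the second
$second = $gen->current();
$gen->next();              // resumes again and runs to the end of the function
$done = !$gen->valid();    // true: the generator is exhausted

echo implode(', ', $log), "\n";
```

Here foreach would perform the same current()/next()/valid() calls automatically.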

Delegating a generator with yield from

Since PHP 7, generator delegation lets you yield values from another generator, a Traversable object, or an array using yield from. The outer generator yields all the values of the inner generator, object, or array until it is exhausted, after which the outer generator continues its own execution.

If yield from is used with a generator, the yield from expression also evaluates to the value returned by the inner generator.
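
A small sketch of delegation, assuming two illustrative functions countToTen() and sevenEight() that are not part of the original lecture. It delegates to an array, to a Traversable object and to another generator, and uses the inner generator's return value as the last yielded number.

```php
<?php
// Hypothetical inner generator: yields 7 and 8, then returns 10.
function sevenEight(): Generator
{
    yield 7;
    yield 8;
    return 10; // becomes the value of the "yield from" expression in the caller
}

// Hypothetical outer generator that delegates to three kinds of sources.
function countToTen(): Generator
{
    yield 1;
    yield 2;
    yield from [3, 4];                     // an array
    yield from new ArrayIterator([5, 6]);  // a Traversable object
    $last = yield from sevenEight();       // another generator; $last is 10
    yield 9;
    yield $last;
}

foreach (countToTen() as $num) {
    echo $num, ' ';
}
```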


The result of this example:

1 2 3 4 5 6 7 8 9 10 

Using iterators

Iterating over a large dataset

Suppose you have a large set of goods and you need to select those that are delivered from a certain warehouse.

Without iterators or generators, it would look like this:

private function getGoodsByStore($storeId)
{
    $goods = Goods::where('store_is', $storeId)->get();
    $filteredGoods = [];

    foreach ($goods as $product) {
        $filteredGoods[] = $this->makeEntity($product);
    }

    return $filteredGoods;
}

The problem is easy to see: the more products there are, the more memory $filteredGoods needs.

One solution is to create an iterator that walks over $goods and returns each entity without the intermediate $filteredGoods array. However, writing such an iterator class takes a fair amount of code.

Alternatively, you can use a PHP generator instead of an iterator.



private function getGoodsByStore($storeId)
{
    $goods = Goods::where('store_is', $storeId)->get();

    foreach ($goods as $product) {
        yield $this->makeEntity($product);
    }
}

Refactoring the getGoodsByStore method to use a generator is very simple: replace the appending of values to the $filteredGoods variable and the final return with the yield construct.

Because there is no $filteredGoods array, only a generator, this method no longer holds every converted entity in memory at once, however many goods need to be returned, and each entity is created only when it is actually requested.
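
The "only when needed" part can be checked with a small self-contained sketch. The function entities(), the $makeEntity closure and the $created counter are stand-ins invented for this illustration; they replace the Goods query and the makeEntity method, which are not available here.

```php
<?php
// Hypothetical stand-in for the generator method: converts rows lazily.
function entities(iterable $rows, callable $makeEntity): Generator
{
    foreach ($rows as $row) {
        yield $makeEntity($row);
    }
}

// Count how many entities are actually built.
$created = 0;
$makeEntity = function (int $id) use (&$created): array {
    $created++;
    return ['id' => $id];
};

// Look for one product among 1000 rows and stop as soon as it is found.
$found = null;
foreach (entities(range(1, 1000), $makeEntity) as $entity) {
    if ($entity['id'] === 3) {
        $found = $entity;
        break;
    }
}

echo $created, "\n"; // only the entities up to the match were built
```

With an array-returning version, all 1000 entities would have been built before the loop even started.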

Aggregating multiple data sources

Now let's look at where $goods comes from. Let's assume the goods can come from different sources: a relational database and Elasticsearch.

To start with, we can write a simple method that aggregates these two sources without using generators:



private function getGoods($storeId)
{
    $products = [];

    // from DB
    $goods = Goods::where('store_is', $storeId)->get();
    foreach ($goods as $product) {
        $products[] = $this->makeEntity($product);
    }

    // from Elasticsearch
    $cursor = $this->esClient->findAll();
    foreach ($cursor as $data) {
        $products[] = $this->makeEntity($data);
    }

    return $products;
}

Note that the amount of memory consumed when using this method is very dependent on the number of products received from the database and Elasticsearch.

Here you can use generators and return the result:



private function getGoods($storeId)
{
    // from DB
    $goods = Goods::where('store_is', $storeId)->get()->toArray();
    foreach ($goods as $product) {
        yield $this->makeEntity($product);
    }

    // from Elasticsearch
    $cursor = $this->esClient->findAll();
    foreach ($cursor as $data) {
        yield $this->makeEntity($data);
    }
}

This is certainly better, but we have another problem: the getGoods method has more than one responsibility.

We have to separate two responsibilities (getting data from the database and calling Elasticsearch) into two methods:

private function getGoods($storeId)
{
    yield from $this->getGoodsFromDB($storeId);
    yield from $this->getGoodsFromES();
}

private function getGoodsFromDB($storeId)
{
    $goods = Goods::where('store_is', $storeId)->get()->toArray();
    foreach ($goods as $product) {
        yield $this->makeEntity($product);
    }
}

private function getGoodsFromES()
{
    $cursor = $this->esClient->findAll();
    foreach ($cursor as $data) {
        yield $this->makeEntity($data);
    }
}

Note the use of the yield from operator (available since PHP 7.0), which lets a generator delegate to other generators.

This is ideal, for example, for aggregating multiple data sources that are exposed as generators.

The yield from operator works with any Traversable object and with arrays, so arrays and iterators can also be used with it.
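
One detail worth knowing when mixing sources this way: yield from keeps the keys of each source, so two list-like sources both produce the keys 0, 1, and so on. The sketch below uses an illustrative function letters() (not from the original lecture) to show how this affects iterator_to_array().

```php
<?php
// Hypothetical generator that delegates to an array and to an iterator.
function letters(): Generator
{
    yield from ['a', 'b'];                     // keys 0 and 1
    yield from new ArrayIterator(['c', 'd']);  // keys 0 and 1 again
}

// With keys preserved (the default), later values overwrite earlier ones.
$overwritten = iterator_to_array(letters());

// Passing false as the second argument discards the keys and keeps every value.
$all = iterator_to_array(letters(), false);

echo implode(' ', $overwritten), "\n";
echo implode(' ', $all), "\n";
```

A plain foreach over letters() is not affected, because it does not collect the values into an array by key.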

Using this construction, we can aggregate multiple data sources with a couple of lines of code:

// The method name getEbooks is an assumption.
private function getEbooks()
{
    yield from $this->getEbooksFromFile();
    yield from $this->getEbooksFromDB();
}

Simulating asynchronous tasks

Last but not least, generators can also be used to simulate asynchronous tasks. While writing this post, I came across @nikita_ppv's post on the same topic, and since he was the first to implement generators in PHP, I'll just leave a link to his post.

He briefly explains what generators are and then shows in detail how the ability to pause a generator, send data into it and receive data from it can be used to implement coroutines and even multitasking.
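
As a very small taste of the send/receive mechanism, here is a sketch of a generator that receives values through Generator::send(). It is not taken from the linked post; the function runningTotal() is an assumption made for this illustration.

```php
<?php
// Hypothetical coroutine: keeps a running total of the amounts sent into it.
function runningTotal(): Generator
{
    $total = 0;
    while (true) {
        // Hand the current total to the caller, then wait for the next amount.
        $amount = yield $total;
        $total += $amount;
    }
}

$gen = runningTotal();
$first  = $gen->current(); // runs to the first yield; total is 0
$second = $gen->send(5);   // resumes with $amount = 5, returns the next yielded total
$third  = $gen->send(10);  // resumes with $amount = 10

echo $first, ' ', $second, ' ', $third, "\n";
```

The value passed to send() becomes the result of the yield expression inside the generator, and send() itself returns whatever the generator yields next. This two-way exchange is the building block for coroutines.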

Thus

  • Generators can be used as lightweight iterators;
  • Generators can produce arbitrarily long sequences without holding all the values in memory at once;
  • Generators can be aggregated using generator delegation;
  • Generators can be used to implement multitasking.
