Get Random rows in Doctrine

Reading Time 2 min

Every Php Symfony developer knows the struggle of getting random rows/records via Doctrine. This is because there is no native solution present and quite unfortunately, the doctrine team isn’t even willing to implement this feature! (Reference)

doctrine random rows

So how could we get random rows/objects? Lets go over the possible options, each having its advantages & disadvantages :

1. Add a customized Numeric Function:

To do this, define a new class in your project as follows:

namespace AppBundle\Doctrine2\Function;

use Doctrine\ORM\Query\Lexer;

class Rand extends \Doctrine\ORM\Query\AST\Functions\FunctionNode
{

    public function parse(\Doctrine\ORM\Query\Parser $parser)
    {
        $parser->match(Lexer::T_IDENTIFIER);
        $parser->match(Lexer::T_OPEN_PARENTHESIS);
        $parser->match(Lexer::T_CLOSE_PARENTHESIS);
    }

    public function getSql(\Doctrine\ORM\Query\SqlWalker $sqlWalker)
    {
        return 'RAND()';
    }
}

Then register this class in your config.yml

// config.yml
doctrine:
     orm:
         dql:
             numeric_functions:
                 Rand: AppBundle\Doctrine2\Function\Rand

Once this is done, you can use this RAND() function directly in your query as

$query->addSelect('RAND() as HIDDEN rand')->orderBy('rand');

2. Use Native SQL in Doctrine

First, you could issue a Raw SQL query to get random records using MySQL’s ORDER BY RAND(). Then load the associated objects using native SQL in doctrine. This is illustrated below:

// get random ID's using RAW SQL
$sql = 'SELECT id from table ORDER BY RAND() LIMIT 100';
$stmt = $conn->prepare($sql);
$stmt->execute();
$random_ids = array();
while ($val = $stmt->fetch()) {
    $random_ids[] = $val['id'];
}
// native SQL in doctrine to load associated objects
$query = $this->em->createQuery("SELECT tt
                                 FROM AppBundle:table tt
                                 WHERE tt.id in (:ids)")
                  ->setParameter('ids', $random_ids);
$randomEntities = $query->getResult();

This way we can get random records even with complex joins. For faster performance with larger datasets, check out this StackOverflow post (& the original reference) which shows how you could use RAW sql to get back random rows.

3. Use DoctrineExtensions library

Install the library as per the instructions and update your config.yml

// config.yml
doctrine:
     orm:
         dql:
             numeric_functions:
                 rand: DoctrineExtensions\Query\Mysql\Rand

Then you can simply use the function RAND() defined in this library with Query Builder as

 
$query->addSelect('column')->orderBy('RAND()'); 

4. Use Php to get Random Rows/records

When you have only a smaller dataset  (less than 10k records), then you can have those shuffled using Php.

// get all tasks
$tasks = $em->getRepository('Entity\Task')->findAll();
// shuffle records
shuffle($tasks);

This method has the disadvantage that it would be quite inefficient if you have a huge dataset (100k+ records)

BONUS

Here’s a small hack that will give you very good performance at the cost of not-so-random records.

//Get count of total rows
$rows = $em->createQuery('SELECT COUNT(u.id) 
                          FROM AcmeUserBundle:User u')
           ->getSingleScalarResult();
// calculate a random offset
$offset = max(0, rand(0, $rows - $n - 1));
//Get the first $n rows(users) starting from a random point
$query = $em->createQuery('
                SELECT DISTINCT u
                FROM AcmeUserBundle:User u')
            ->setMaxResults($n)
            ->setFirstResult($offset);
$result = $query->getResult(); 

$n number of users object that you retrieve are consecutive (i.e. the i-th, (i+1)-th,…,(i+$n)-th), For use-cases where there is the need of taking one or two entities at random, and not the whole list, this is an effective alternative.

Hope that this helps you guys! 🙂

Leave a Reply