# Pulling items out of a DB with weighted chance

Let's say I had a table full of records that I wanted to pull random records from. However, I want certain rows in that table to appear more often than others (and which ones varies by user). What's the easiest way to go about this using SQL?

The only way I can think of is to create a temporary table, fill it with the rows I want to be more common, and then pad it with other randomly selected rows from the table. Is there a better way?

One way I can think of is to add another column to the table that holds a rolling sum of the weights. You then generate a random number between 0 and the total of all the weights, and pull the row with the lowest rolling sum that is greater than the random number.

For example, if you had four rows with the following weights:

```
+---+--------+------------+
|row| weight | rollingsum |
+---+--------+------------+
| a |      3 |          3 |
| b |      3 |          6 |
| c |      4 |         10 |
| d |      1 |         11 |
+---+--------+------------+
```

Then pick a random number `n` between 0 and 11 (0 inclusive, 11 exclusive), and return row `a` if `0<=n<3`, `b` if `3<=n<6`, `c` if `6<=n<10`, and `d` if `10<=n<11`.

Here are a couple of links on generating rolling sums:
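To make the lookup concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `items` table, its column names, and the helpers `pick_for` and `pick_random` are assumptions made for illustration, and the rolling sum is computed on the fly with a correlated subquery rather than stored in a column.

```python
import random
import sqlite3

# Hypothetical table mirroring the four example rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, weight INTEGER)"
)
conn.executemany(
    "INSERT INTO items (name, weight) VALUES (?, ?)",
    [("a", 3), ("b", 3), ("c", 4), ("d", 1)],
)

# The inner query builds the rolling sum; the outer query returns the row
# with the lowest rolling sum that is strictly greater than n.
PICK_SQL = """
SELECT name FROM (
    SELECT i1.name AS name,
           (SELECT SUM(i2.weight) FROM items i2 WHERE i2.id <= i1.id)
               AS rollingsum
    FROM items i1
)
WHERE rollingsum > ?
ORDER BY rollingsum
LIMIT 1
"""


def pick_for(n):
    """Return the row name that owns the number n (0 <= n < total weight)."""
    return conn.execute(PICK_SQL, (n,)).fetchone()[0]


def pick_random(rng=random):
    """Draw n uniformly from [0, total weight) and return the matching row."""
    total = conn.execute("SELECT SUM(weight) FROM items").fetchone()[0]
    return pick_for(rng.randrange(total))
```

With these weights, `pick_for(2)` lands on `a` and `pick_for(3)` on `b`, matching the ranges above. On a large table, storing the rolling sum in an indexed column would likely be cheaper than recomputing it for every draw.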

http://dev.mysql.com/tech-resources/articles/rolling_sums_in_mysql.html

http://dev.mysql.com/tech-resources/articles/rolling_sums_in_mysql_followup.html

I don't know that this can be done very easily with SQL alone. With T-SQL or similar, you could write a loop to duplicate rows, or you could use SQL to generate the instructions for doing the row duplication instead.

I don't know your probability model, but you could use an approach like this to achieve the latter. Given these table definitions:

```
RowSource
---------
RowID

UserRowProbability
------------------
UserId
RowId
FrequencyMultiplier
```

You could write a query like this (SQL Server specific):

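```
-- @UserId is an assumed parameter holding the current user's ID
SELECT TOP 100 rs.RowId, urp.FrequencyMultiplier
FROM RowSource rs
LEFT JOIN UserRowProbability urp
    ON urp.RowId = rs.RowId AND urp.UserId = @UserId
ORDER BY ISNULL(urp.FrequencyMultiplier, 1) DESC, NEWID()
```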

This would take care of selecting a random set of rows as well as how many times each should be repeated. Then, in your application logic, you could do the row duplication and shuffle the results.
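The application-side step might look roughly like the following Python sketch. The function name `duplicate_and_shuffle` and the `(row_id, multiplier)` tuple shape are assumptions for illustration; a missing multiplier is treated as 1, mirroring the `ISNULL(..., 1)` in the query.

```python
import random


def duplicate_and_shuffle(rows, rng=None):
    """Repeat each row ID by its multiplier, then shuffle the result.

    rows: iterable of (row_id, multiplier) pairs as returned by the query;
          a multiplier of None counts as 1.
    rng:  optional random.Random instance, useful for reproducible tests.
    """
    if rng is None:
        rng = random.Random()
    expanded = []
    for row_id, multiplier in rows:
        copies = 1 if multiplier is None else multiplier
        expanded.extend([row_id] * copies)
    rng.shuffle(expanded)
    return expanded
```

For example, passing `[(1, 3), (2, None)]` yields a list containing row 1 three times and row 2 once, in random order.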

Start with 3 tables: users, data and user-data. User-data records which rows should be preferred for each user.

Then create one view based on the data rows that are preferred by the user.

Create a second view that holds the non-preferred data.

Create a third view that is a union of the first two. The union should select more rows from the preferred data.

Then finally select random rows from the third view.
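One possible reading of the steps above, sketched with Python's built-in `sqlite3` module. The table, column, and view names are assumptions, as is the choice to weight preferred rows by listing them twice in the union; the answer does not say how much more often preferred rows should appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and sample data.
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE data (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE user_data (user_id INTEGER, data_id INTEGER);

INSERT INTO users (id) VALUES (1);
INSERT INTO data (id, label) VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO user_data (user_id, data_id) VALUES (1, 1), (1, 2);

-- View 1: rows each user prefers.
CREATE VIEW preferred AS
SELECT ud.user_id AS user_id, d.id AS data_id, d.label AS label
FROM user_data ud JOIN data d ON d.id = ud.data_id;

-- View 2: every other (user, row) pair.
CREATE VIEW non_preferred AS
SELECT u.id AS user_id, d.id AS data_id, d.label AS label
FROM users u CROSS JOIN data d
WHERE NOT EXISTS (
    SELECT 1 FROM user_data ud
    WHERE ud.user_id = u.id AND ud.data_id = d.id
);

-- View 3: preferred rows listed twice (assumed weight of 2), others once.
CREATE VIEW weighted AS
SELECT user_id, data_id, label FROM preferred
UNION ALL
SELECT user_id, data_id, label FROM preferred
UNION ALL
SELECT user_id, data_id, label FROM non_preferred;
""")


def pick(user_id, count=1):
    """Return `count` randomly chosen labels for the user from the union."""
    rows = conn.execute(
        "SELECT label FROM weighted WHERE user_id = ? "
        "ORDER BY RANDOM() LIMIT ?",
        (user_id, count),
    ).fetchall()
    return [label for (label,) in rows]
```

Because the union repeats preferred rows, asking for more than one row at a time can return the same row twice. `UNION ALL` is needed here, since a plain `UNION` would collapse the duplicates and remove the weighting.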