Help with SQL

npalardy · 30 July 2021 20:03

Given a table like

id | text | number
|1|row1|1|
|2|row2|1|
|3|row3|2|
|4|row4|3|
|5|row5|3|
|6|row6|4|
|7|row7|5|

select * from table group by number having count(number) = 1

Eli.Ott · 31 July 2021 06:27

CREATE TABLE `test` (
  `id` int(11) NOT NULL,
  `aText` varchar(255) DEFAULT NULL,
  `aNumber` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO test VALUES (1, 'row1', 1);
INSERT INTO test VALUES (2, 'row2', 1);
INSERT INTO test VALUES (3, 'row3', 2);
INSERT INTO test VALUES (4, 'row4', 3);
INSERT INTO test VALUES (5, 'row5', 3);
INSERT INTO test VALUES (6, 'row6', 4);
INSERT INTO test VALUES (7, 'row7', 1);

SELECT *
  FROM test
 WHERE aNumber IN (
           SELECT aNumber
             FROM test
         GROUP BY aNumber
           HAVING COUNT(aNumber) = 1
       )
;

DaveS · 31 July 2021 14:32

way overkill… Normans answer is by far the simpleist and quickest

Eli.Ott · 31 July 2021 15:04

The request was how to return the rows:

Norman’s answer doesn’t fulfil that.

DaveS · 31 July 2021 15:10

Uh… it most certainly does

CREATE TABLE `test` (
  `id` int(11) NOT NULL,
  `aText` varchar(255) DEFAULT NULL,
  `aNumber` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO test VALUES (1, 'row1', 1);
INSERT INTO test VALUES (2, 'row2', 1);
INSERT INTO test VALUES (3, 'row3', 2);
INSERT INTO test VALUES (4, 'row4', 3);
INSERT INTO test VALUES (5, 'row5', 3);
INSERT INTO test VALUES (6, 'row6', 4);
INSERT INTO test VALUES (7, 'row7', 1);

select * from test group by anumber having count(anumber) = 1

returns

"id","aText","aNumber"
"3","row3","2"
"6","row6","4"

which happens to be the exact same result your more verbose code returned.

Dale · 31 July 2021 16:10

Thanks. I knew there was a way to do it but I just couldn’t get it.

Eli.Ott · 31 July 2021 17:05

This is not correct SQL. You cannot select non-aggregated columns when you have a GROUP BY clause in your query.

Unfortunately MySQL and others – but not all – allow this. And for example when MySQL 5.7 made sql_mode = ONLY_FULL_GROUP_BY the default, people turned it off on their server instead of correcting their queries.

EDIT:

If you ever stumble over an error like this:

Query 1 ERROR: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column ‘testdb.test.id’ which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

… for example when moving to another MySQL server, then you will now know why.

Alwaysbusy · 31 July 2021 17:58

True. Lost a lot of time in the past on a MySQL server upgrade with an app. You can indeed turn it off, but decided it is better to correct the queries. This is a very simple query, but with more complex ones, you can get in a lot of trouble. Learned my lesson

npalardy · 1 August 2021 00:35

It absolutely does

your query just buries it one level deeper for no benefit

npalardy · 1 August 2021 00:37

Many DB’s permit this non-standard usage
SQLite for instance (where I tested it)

Eli.Ott · 1 August 2021 08:01

There a two problems:

You do not have the guarantee that the result is the correct one. If you do not include all non-aggregated columns in the GROUP-BY-clause, the database engine will choose a value for each non-aggregated column in each returned row. There is a good explanation here: [mysql - What are the benefits of only_full_group_by mode? - Stack Overflow](https://What are the benefits of only_full_group_by mode?)

As mentioned above, you might get into trouble when moving to another server or database system which enforces it (it is in the SQL-92 standard !).

Eli.Ott · 1 August 2021 08:40

Here is an example of what could go wrong with our specific query here:

Assume that you are satisfied with the result, and now you copy paste the query – as it works perfectly – because you want to get the rows where the aNumber field occurs twice, not once, in the table. So you change HAVING count(aNumber) = 1 to HAVING count(aNumber) = 2 :

  SELECT *
    FROM test
GROUP BY aNumber HAVING count(aNumber) = 2;

Now the database engine does not know which row values it should use for the columns id and aText. Both results are possible – and the database engine will chose one for you:

|4|row4|3|     -- or maybe you'll get: |5|row5|3|

Imagine your code uses the value from id or aText of the row the engine chose for you!

If you are absolutely sure that you do not care which values you’ll get in the non-aggregated columns, you can use the ANY_VALUE keyword:

  SELECT ANY_VALUE(id),
         aNumber,
         ANY_VALUE(aText)
    FROM test
GROUP BY aNumber HAVING count(aNumber) = 2;

… and you’ll also get:

|4|row4|3|     -- or maybe: |5|row5|3|

… despite ONLY_FULL_GROUP_BY being on. Again I just don’t know what you could do with the contents of the id and the aText column since the result is incomplete.

The correct way would be this:

SELECT *
  FROM test
 WHERE aNumber IN (
           SELECT aNumber
             FROM test
         GROUP BY aNumber HAVING count(aNumber) = 2
)

… and you’ll get:

|4|row4|3|
|5|row5|3|

… and you are now safe to loop over the result and use the values from id and aText.

JeffT · 1 August 2021 08:42

select * from test group by anumber having count(anumber) = 1

I know this wont work in Oracle or SQL server because anumber is not explicitly mentioned in the output fields

There, you might use

select * from test  where anumber in
(select anumber from test group by anumber having count(anumber) = 1);

…which is what Eli.ott said earlier, of course.

Rick.A · 1 August 2021 13:04

For aggregated data I prefer constructs where I find the selected set first and return the matching IDs after, like:

SELECT
	id,
	a_text,
	a_number 
FROM
	test
	INNER JOIN ( SELECT id AS id_found, count( a_number ) AS unique_val FROM test GROUP BY a_number HAVING unique_val = 1 ) AS found ON test.id = id_found;