Database Puzzle

DaveS · 11 August 2020 16:18

I have a table with a simple structure

ID
Name
Property
Value

for each ID/NAME there will be N property/Value combinations

what I need to do it two things.

One : Select All the Unique NAME/PROPERTY/VALUE records , but ONLY if ALL ID;s have the same data… That is I want to know what Property/Values are shared across all instances of “NAME”

Two : Select all the rest including the ID.

So if there are 10 “name” records… and all have a “Background=BLUE” then I need that data in scenario #1, but if even one record is NOT blue, then I need all 10 records in scenario #2

npalardy · 11 August 2020 17:12

A select distinct for the first ?
that is assuming I understand you question right which I might not
Not sure about what “data” you mean (the VALUE?) so the group by might not be right
something like

   select distinct name, property, value from table group by ID

is this for representing an “object” and all its properties and those values ?

DaveS · 11 August 2020 17:15

That won’t work…

1 color blue
2 color blue
3 color blue
1 border 3
2 border 3
3 border 2

Scenario #1 should return “COLOR BLUE”
where Scenario 3 would return all the “border” records

npalardy · 11 August 2020 17:39

you said this had 4 columns - ID, Name, Property, Value but the data you posted has 3 ?
I’m confused ?

DaveS · 11 August 2020 17:48

you are correct… assume example, all have the same name

1 xyz color blue
2 xyz color blue
3 xyz color blue
1 xyz border 3
2 xyz  border 3
3 xyz border 2

query #1 should return “XYZ COLOR BLUE”
query #2 should return

1 xyz border 3
2 xyz  border 3
3 xyz border 2

JeffT · 11 August 2020 17:53

select property, value from table where property+value in
(select property+value from table group by name,property,value having count(1) = 1)

union

select all property,value from table
where property+value not in
(
select property+value from table where property+value in
(select property+value from table group by name,property,value having count(1) = 1)
);

although its difficult to work out what you want…

npalardy · 11 August 2020 17:59

would it be correct to say that for case 1 that you should get xyz color blue because there are 3 id’s for xyz and each name/property pair (xyz color) value is blue (they all match for that property)
and you should NOT get border because while it has 3 values they do NOT all match

and the second query is all the items that were NOT in the first ?

DaveS · 11 August 2020 18:03

Norman… that sounds correct… This is for “property values” of a “control set”, I first what all the property values that ALL members of the control set SHARE, then I want all the properties that are NOT shared (for those I need to know their ID as well)

Jeff… I made a test table that mirrored my example,


select property, value from test where property+value in
(select property+value from test group by name,property,value having count(1) = 1)

only returned

values(3,"xyz","border","2");

DaveS · 11 August 2020 18:08

I think query #1 should be


   SELECT property,value
     FROM test a
    WHERE property + value IN (SELECT property + value
                                 FROM test
                                 GROUP BY name,property,value
                                 HAVING COUNT(*) = (SELECT Count(*) 
                                                       FROM TEST b 
                                                      WHERE  a.property=b.property))

and query #2 would replace COUNT() = with COUNT() <>

npalardy · 11 August 2020 18:13

DaveS:

 SELECT property,value
     FROM test a
    WHERE property + value IN (SELECT property + value
                                 FROM test
                                 GROUP BY name,property,value
                                 HAVING COUNT(*) = (SELECT Count(*) 
                                                       FROM TEST b 
                                                      WHERE  a.property=b.property))

coming to much the same conclusion myself that a count of id’s or something was required

the outermost might need to be a distinct name, property, value

EDIT : and query #2 would NOT use distinct
~~That does seem to work given the sample data~~
scratch that

DaveS · 11 August 2020 18:17

crap… the query needed to also return the name… adding that breaks this query

npalardy · 11 August 2020 18:17

If I start with

create table stuff ( id, name , property, value ) ;
              
insert into stuff values (1, 'xyz','color','blue') ;
insert into stuff values (2, 'xyz','color','blue');
insert into stuff values (3, 'xyz','color', 'blue');
insert into stuff values (1, 'xyz','border', 3);
insert into stuff values (2, 'xyz','border', 3);
insert into stuff values (3, 'xyz','border',2);                                                     
insert into stuff values (11, 'abc','color','blue') ;
insert into stuff values (12, 'abc','color','blue');
insert into stuff values (13, 'abc','color', 'blue');
insert into stuff values (11, 'abc','border', 3);
insert into stuff values (12, 'abc','border', 3);
insert into stuff values (13, 'abc','border',2);

the queries dont do what I understand they should - have I set something up wrong for sample data etc ?

DaveS · 11 August 2020 18:18

yeah I just discovered that… here is my current test script

DELETE FROM test ;
INSERT INTO test VALUES ( 1,"xyz","color", "blue") ;
INSERT INTO test VALUES ( 2,"xyz","color","blue") ;
INSERT INTO test VALUES ( 3,"xyz","color","blue") ;
INSERT INTO test VALUES ( 1,"xyz","border","3") ;
INSERT INTO test VALUES ( 2,"xyz","border","3") ;
INSERT INTO test VALUES ( 3,"xyz","border","2") ;
INSERT INTO test VALUES ( 4,"abc","color","red") ;
INSERT INTO test VALUES ( 5,"abc","color","blue") ;
INSERT INTO test VALUES ( 6,"abc","color","blue") ;
INSERT INTO test VALUES ( 4,"abc","border","3") ;
INSERT INTO test VALUES ( 5,"abc","border","3") ;
INSERT INTO test VALUES ( 6,"abc","border","3") ;

   SELECT name,property,value
     FROM test a
    WHERE name+property + value IN (
   SELECT name+property + value
     FROM test
    GROUP BY name,property,value
   HAVING COUNT(*) = (SELECT Count(*) 
                        FROM TEST b 
                       WHERE a.name=b.name 
                          AND a.property=b.property))

npalardy · 11 August 2020 18:19

query #1

 SELECT distinct name,property,value
     FROM stuff a
    WHERE property + value IN (SELECT property + value
                                 FROM stuff
                                 GROUP BY name,property,value
                                 HAVING COUNT(*) = (SELECT Count(*) 
                                                       FROM stuff b 
                                                      WHERE a.property=b.property
                                                        AND a.name = b.name )) ;

query # 2 does swap COUNT() = for COUNT() <>

EDIT Query #2
query # 2 also doesnt use the outer most DISTINCT

 SELECT name,property,value
     FROM stuff a
    WHERE property + value IN (SELECT property + value
                                 FROM stuff
                                 GROUP BY name,property,value
                                 HAVING COUNT(*) <> (SELECT Count(*) 
                                                       FROM stuff b 
                                                      WHERE a.property=b.property
                                                        AND a.name = b.name )) ;

EDIT This seems to work as intended

DaveS · 11 August 2020 18:23

ok… staying with query #1… I double checked , and your query returns all but one of the test records…

npalardy · 11 August 2020 18:26

not here in query 2 IF you remove the “distinct” which I did neglect to do in previous posts

npalardy@server ~ % sqlite3
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table stuff ( id, name , property, value ) ;
sqlite>               
sqlite> insert into stuff values (1, 'xyz','color','blue') ;
sqlite> insert into stuff values (2, 'xyz','color','blue');
sqlite> insert into stuff values (3, 'xyz','color', 'blue');
sqlite> insert into stuff values (1, 'xyz','border', 3);
sqlite> insert into stuff values (2, 'xyz','border', 3);
sqlite> insert into stuff values (3, 'xyz','border',2);                                                     
sqlite> insert into stuff values (11, 'abc','color','blue') ;
sqlite> insert into stuff values (12, 'abc','color','blue');
sqlite> insert into stuff values (13, 'abc','color', 'blue');
sqlite> insert into stuff values (11, 'abc','border', 3);
sqlite> insert into stuff values (12, 'abc','border', 3);
sqlite> insert into stuff values (13, 'abc','border',2);
sqlite>                                                       
sqlite> SELECT distinct name,property,value
   ...>   FROM stuff a
   ...>  WHERE property + value IN (SELECT property + value
   ...>                               FROM stuff
   ...>                              GROUP BY name,property,value
   ...>                             HAVING COUNT(*) = (SELECT Count(*) 
   ...>                                                  FROM stuff b 
   ...>                                                 WHERE a.property=b.property
   ...>                                                   AND a.name = b.name )) ;
xyz|color|blue
abc|color|blue
sqlite> 
sqlite> SELECT name,property,value
   ...>   FROM stuff a
   ...>  WHERE property + value IN (SELECT property + value
   ...>                               FROM stuff
   ...>                              GROUP BY name,property,value
   ...>                             HAVING COUNT(*) <> (SELECT Count(*) 
   ...>                                                   FROM stuff b 
   ...>                                                  WHERE a.property=b.property
   ...>                                                    AND a.name = b.name )) ;                                                      
xyz|border|3
xyz|border|3
xyz|border|2
abc|border|3
abc|border|3
abc|border|2
sqlite>

DaveS · 11 August 2020 18:26

try this data then…

INSERT INTO test VALUES ( 1,"xyz","color", "blue") ;
INSERT INTO test VALUES ( 2,"xyz","color","blue") ;
INSERT INTO test VALUES ( 3,"xyz","color","blue") ;
INSERT INTO test VALUES ( 1,"xyz","border","3") ;
INSERT INTO test VALUES ( 2,"xyz","border","3") ;
INSERT INTO test VALUES ( 3,"xyz","border","2") ;
INSERT INTO test VALUES ( 4,"abc","color","red") ;
INSERT INTO test VALUES ( 5,"abc","color","blue") ;
INSERT INTO test VALUES ( 6,"abc","color","blue") ;
INSERT INTO test VALUES ( 4,"abc","border","3") ;
INSERT INTO test VALUES ( 5,"abc","border","3") ;
INSERT INTO test VALUES ( 6,"abc","border","3") ;

DaveS · 11 August 2020 18:29

I get this
“name”,“property”,“value”
“xyz”,“color”,“blue”
“xyz”,“color”,“blue”
“xyz”,“color”,“blue”
“xyz”,“border”,“3”
“xyz”,“border”,“3”
“xyz”,“border”,“2”
“abc”,“color”,“red”
“abc”,“color”,“blue”
“abc”,“color”,“blue”
“abc”,“border”,“3”
“abc”,“border”,“3”
“abc”,“border”,“3”

npalardy · 11 August 2020 18:33

yeah seeing that the outer selects property + value criteria is NOT selective enough

so abc’s color may get reported when we’re looking at xyz’s color - which is blue

this seems to fix that

SELECT distinct name,property,value
  FROM stuff a
 WHERE name || '-'|| property ||'-' || value IN (SELECT name || '-'|| property ||'-' || value
                              FROM stuff
                             GROUP BY name,property,value
                            HAVING COUNT(*) = (SELECT Count(*) 
                                                 FROM stuff b 
                                                WHERE a.property=b.property
                                                  AND a.name = b.name )) ;

and I get

sqlite> SELECT distinct name,property,value
   ...>   FROM stuff a
   ...>  WHERE name || '-'|| property ||'-' || value IN (SELECT name || '-'|| property ||'-' || value
   ...>                               FROM stuff
   ...>                              GROUP BY name,property,value
   ...>                             HAVING COUNT(*) = (SELECT Count(*) 
   ...>                                                  FROM stuff b 
   ...>                                                 WHERE a.property=b.property
   ...>                                                   AND a.name = b.name )) ;
xyz|color|blue
abc|border|3

npalardy · 11 August 2020 18:35

And query #2 then looks like

SELECT name,property,value
  FROM stuff a
 WHERE name || '-'|| property ||'-' || value IN (SELECT name || '-'|| property ||'-' || value
                              FROM stuff
                             GROUP BY name,property,value
                            HAVING COUNT(*) <> (SELECT Count(*) 
                                                 FROM stuff b 
                                                WHERE a.property=b.property
                                                  AND a.name = b.name )) ;

and that gives

xyz|border|3
xyz|border|3
xyz|border|2
abc|color|red
abc|color|blue
abc|color|blue