General Issues in Database Design and programming

fguy64

New Member
Messages
218
Reaction score
0
Points
0
greetings, I am just getting my feet wet with php and mySQL, and I wanted to discuss some basic concepts typical to all relational database systems. I posted this in the Compters and Technology forum but didn't get a response, so I thought I'd try in this board

We can start off with Primary Key fields and Index fields. If people have other RDBMS concepts to discuss, then I guess this would be a good place.

As I see it, loosely defined, an index field is a specially defined field in a table that aids in query and sort operations

for example, if you had a table called customers with fields cust_id, firstname, lastname, address, then last name might be an appropriate index field if you were doing queries such as "show me all the Smiths", or "sort this table by last name".

if you wanted to link to another table, say orders, then the last name would not be appropriate. intuitively, you can imagine the trouble you would run into with a large number of customers and you queried something like (show me all the orders from Smith). Cause Smith is not unique.

So you would query on cust_id, cause it is unique, and thus is called a "primary key field" for this table.

So based on this, I have a few questions to pose...

Is it correct to say that a primary key field is just an index field whose values are unique. ? Is this a useful distinction to make?

If this is true, then why is it when you are creating a table using the myPHPAdmin tool in cPanel, that part of the field definition includes radio buttons for Primary Key, Index, and unique. If my definition of index and primary key is true, then why do we need the "unique" radio button.
 

garrettroyce

Community Support
Community Support
Messages
5,604
Reaction score
245
Points
63
Unique fields don't have to be indexes and indexes don't have to be unique. Primary keys are just the name for having both attributes, unique and index, at the same time as well as enforcing NOT NULL.

this explains it pretty well:
http://en.wikipedia.org/wiki/Primary_key
 
Last edited:

fguy64

New Member
Messages
218
Reaction score
0
Points
0
Unique fields don't have to be indexes and indexes don't have to be unique. Primary keys are just the name for having both attributes, unique and index, at the same time as well as enforcing NOT NULL.

this explains it pretty well:
http://en.wikipedia.org/wiki/Primary_key

Thanks Garrett, that clears it up nicely. I would like to make one small point, based on general logic and not database...

depends what you mean by unique. here you say that unique fields don't have to be indexes. In the context of the way the cPanel admin tool uses radio buttons, which are mutually exclusive, then I would say a field designated "unique" can't possibly be an index field, cause then it would be designated as Primary Key.

Do you agree? Sorry if I'm being pedantic, it's in may nature to be precise.
 

garrettroyce

Community Support
Community Support
Messages
5,604
Reaction score
245
Points
63
You're right :)

It would depend on the database type, I imagine. There's another stipulation that separates a unique/index from a primary key; primary keys can't be null. Logically though, a unique field can be null, as long as that column only has one row with a null. And there's nothing stopping indexes from being null, except good table design ;)

Again, this may vary by database type. I've never tried to create a nullable index or unique in mysql.
 
Last edited:

fguy64

New Member
Messages
218
Reaction score
0
Points
0
To expand on the null thing, it sounds like you are saying that MySQL forces you to populate the primary key field of all your records in the table, it just isn't possible to do otherwise.

With an ordinary index field, it sounds like it is possible, but having a null field is kind of like bad data, it just means your search or query wont include that record? Or will it actually break the query, in which case it is more than just bad data.

interesting comment about unique fields having only one record with a null in that field. Kinda makes sense, or else it wouldn't be unique :) . Again, are you forced into that requirement by MySQL, when creating the records, or will it just break your queries.

Thanks much.
 

garrettroyce

Community Support
Community Support
Messages
5,604
Reaction score
245
Points
63
I'm not sure how Mysql does this, I haven't done either because I just think it's a bad idea to have an index or unique. I'd just pretend it's impossible :p

It won't break your query, it's completely valid, but it's bad database design. It's saying that the index is such important data that you're going to index all of the results, but it's ok if the result is nothing. Why prioritize nothing?
 

fguy64

New Member
Messages
218
Reaction score
0
Points
0
OK that makes sense. It all speaks to the responsibility of the programmer to make sure certain field get populated, probably by providing appropriate input validation for user input.

Thanks again, learned some good stuff this thread.
 

misson

Community Paragon
Community Support
Messages
2,572
Reaction score
72
Points
48
In discussing indexes and primary keys, it's helpful to distinguish the view of each in RDB theory and implementation.

In the theory, a "primary key" is a minimal set of fields such that the other fields in the row are functionally dependent on the key. The primary key for a table is a candidate key. There is only one primary key for a table, while there can be multiple candidate keys.

In practice, MySQL primary keys are unique, non-null, indexed fields.

As I see it, loosely defined, an index field is a specially defined field in a table that aids in query and sort operations
A more useful view is to talk about indexed fields. An index is not a property of fields or a special field but (in practice) a separate data structure that refers to the table and that can increase performance. From the theoretical viewpoint, an index is entirely redundant. It's useful in the theory for query optimizations, but not for table design.

if you wanted to link to another table, say orders, then the last name would not be appropriate. intuitively, you can imagine the trouble you would run into with a large number of customers and you queried something like (show me all the orders from Smith). Cause Smith is not unique.
Non-intuitively, last name is not a foreign key because it's not a candidate key (i.e. the other fields in both tables aren't functionally dependent on last name).

Is it correct to say that a primary key field is just an index field whose values are unique. ?
That's almost true in terms of (some) implementations. In MySQL, a primary key is implemented with a unique, non-null index. In RDB theory, a primary key doesn't involve an index.

Logically though, a unique field can be null, as long as that column only has one row with a null.

True for MSSQL, but not for MySQL and PostgreSQL. Since NULL isn't equal to NULL, you can have an arbitrary number of rows with a NULL in a UNIQUE field. This is conceptually sound because NULL means "missing value" or "field inapplicable". For example, a driver's license field might be UNIQUE (as no two drivers in one state will have the same ID) but allow multiple NULL values (as not everyone drives).

Try this:
Code:
CREATE TABLE nullable (
  id INT UNIQUE DEFAULT NULL,
  value VARCHAR(64) NOT NULL
);

INSERT INTO nullable (value) VALUES ('foo'), ('bar');
SELECT * FROM nullable;

Indexing a NULL-able column isn't "good" or "bad" from a design standpoint. Remember, indexes are about performance, and you will usually get a performance gain when you query against an indexed field. Whether a column should allow NULL values is entirely dependent on whether it makes sense to allow for missing values or inapplicable fields. For example, to handle people like Cher and Madonna who don't have last names, a "last name" column should allow NULLs. Note that you'll probably end up indexing the last name field. This isn't to say that "if you find yourself indexing a NULLable column, re-examine your table design" isn't a useful rubric for some people, just that if you follow other design principles, it isn't needed.

Much of table design comes from database normalization, which involves the notions of "keys" and "dependencies". Maurer and Scherbakov's "Relational Data Model" looks to be a decent text on the subject. For other information on RDB design, look to Codd's seminal "A Relational Model of Data for Large Shared Data Banks".
 
Top