Lesson 26
Introduction to SQL [raw] - Software School (2024-04-30)
Video Transcript
Like I said, please go to this website, programis.com, so they are SQL editor, and we will run
the examples here.
In this environment, it's very simple.
We're going to type the commands here under the input panel, and we're going to click
the button run SQL in the top right of this input panel, and that will run the query,
and you can see the output here at the bottom of the panel under output.
Left hand side is information about the tables, there are columns, and right hand side is the
data for those columns.
Okay, so today we will learn about SQL.
So essentially today, when you go on your phone, you click your web browser, you access
somebody's website, or you're on social media like Instagram and Facebook, and you see all
this data, right, you have people's information, their names, what they like, their interests,
their posts, etc., etc.
All of that data is stored behind the scenes in what's called a database.
So what happens is your browser is a client that makes a request to some server somewhere,
and that server acts as an intermediate processing your request.
If you ask, hey, I want to know a post about a specific person, then that intermediate server
will ask the database, will query it, hey, database, I need information about the post
for this particular person, and then it will take that and then redirect that back to the
client.
Okay, so behind the scenes, that's part of the back end of software development.
It's stuff that people don't see, whereas the browser stuff that we see is called the
front end with HTML, CSS, and JavaScript.
So let's talk about SQL.
SQL stands for a structured query language, and it's a standard interface by which we
can operate on certain data on a database.
SQL is probably the most popular type of database today.
Now to understand SQL, let's look at a spreadsheet.
I think that'll be easier.
So in spreadsheet, we have rows and columns, right?
Here's on Excel right now.
And the way it works is, so we got a database, a database has many tables, and you can think
of them like this.
For example, I want to keep track of information by certain people.
I can keep track of first name, last name, you know, address, so on.
So I can make them these two cells like this.
So I can have multiple people, right?
I can have John, Bill, I can have Mary, and so as you can see this is highlighted.
So we can think of this as a table of people.
And then we have columns, the first being first name, and the second being last name.
Now every column does the value.
Now we call them every person a row.
So there's a row for John and Doe, and a row for a person of first name Mary, and last
name Ann, and so on.
So that's basically the most basic concepts of SQL.
We got a table, a table has many columns, and each column holds a specific type of information.
In this case it's just characters, but it could be numbers, or maybe a value that's
true or false, that kind of stuff.
Whereas every row represents an object or something related to a unique record.
In this case it's people, right?
One person per row.
You could also keep track of orders if you order something online on Amazon, for example.
So every row here would be an order, and an order has many details about it, right?
You can see who ordered it, what time was it ordered, what the payment details, those
would be like columns.
Okay, now going back to program is, for those who just got here, please go to that website.
Those who just got here, I posted a link to the Zoom chat where you should go.
Okay, so in this environment here, we already have tables set up for us.
It's going to be easy.
We don't have to create the tables, all right?
So let's delete everything from this.
So this environment has some tables about customers, their orders, and their shippings.
So customer, somebody will make an order to purchase something, and eventually that something
will be shipped like Amazon, not home.
So how do we go about accessing the information about customers, for example?
We can start off by using the select.
Now you're going to see me type the words, the keywords, the capitalized, and that's
just a naming convention.
You don't have to follow it if you're comfortable starting out with just lower case for the
keywords, that's fine.
So select is a keyword.
So what we're doing here is we're writing a SQL query, because we want to obtain information
about a certain table, in this case, customers.
So we have to select columns.
So what do you want to know about?
Do you want to know the first name?
Do you want to know the last name of the customer?
Do you want to know the age, or do you want to know every single thing I can possibly inform
you?
In this case, it would be star, meaning every column.
And then you say from keyword, and then the name of the table, customers.
And then click run SQL there.
Notice I finished a semicolon there.
Now you can see the bottom here.
This is the output of the query.
Let me see if again, there's no thing to make it pull up.
So this is a table, and we have columns.
The first column is customer ID, which is a unique identifier.
So typically in SQL, you're going to see the ID column most of the time.
It's just a way for us to uniquely identify every row, because it could be the case that
we had like two different people, but with the same name.
So how do we know we're going to differentiate, say the first row is John, and the second
is also John, but they're an entirely different person.
That's why we always leverage an ID column.
That's called a primary key, so we can differentiate each row from the others.
All right, the second column is first name.
So this gives us information about the person's first name.
And you can see the first name for the first row, meaning the first person is John.
And then it goes on, if you read from left to right in the row, that means more information
about the specific person.
Now if you read from the top to the bottom, it means you're reading for different kind
of people.
So the first one will be John, and then if you want to move on to a different person,
you go down to Robert, David, and so on.
Now following the same pattern, there are columns for last name, age, and country.
Now if you notice here on the left hand side of this environment, all these tables were
already defined, and they each have these specific columns, and each column has a specific
data type.
As you can see, customer ID here, let me make it bigger, is of data type int.
Int is short for integer, which is like a number, one, two, three, and so on, with no
decimal, right?
The second column is first name, and the type is vericare 100.
So vericare has a character varying in string, meaning it can vary the number of characters.
It could be one character, two characters, so on, up to 100, and so on.
All right, I switched to dark mode because somebody asked, if you don't like that, let
me know.
All right.
So, how's everybody doing so far?
Okay, so...
So this is our first query.
It's the most basic one called select.
You want to select specific columns from a table, and the syntax is always like this,
select keyword, space, and then you can specify the columns.
If you want all of them put a star, space from and then to the right hand side of from
with a space, the table name.
Make sure to follow the exact table name as they are written here, because some of them
matter the casing.
You know, it's important in programming in general that you always keep track of every
character that you're looking at as you're writing the code, because in some cases, the
case might matter or not, so be careful with that.
Also, if you notice here, in this environment, the tables were already created for us, but
in the real world setting, you probably would have to issue a command to create the table
and specify all the columns, but since we're starting out, we're already given these
tables, and you always want to make sure to finish your statements or your query
statements with a semicolon.
That's not only a sequel thing, it's a programming language in general.
Many programming languages require you to put a semicolon at the end of every statement.
Otherwise, you're going to get a syntax error.
Try removing the semicolon and running it.
This one didn't care, but usually it shouldn't matter.
All right.
So now we are learning a standard language.
So sequel by itself, it's like a template that everybody should follow.
If you want to do a sequel, if you want to have a sequel database, you must speak this
language using these keywords that we're using right now.
But when you're out in practice, you're going to see different flavors of sequel like MySQL,
Postgres, Microsoft SQL Server, Oracle Database, IBM, and so on.
That's like the analog of, you know, we have cars, right?
And we have different makes.
Even though we have different makes, they all kind of follow the same formula when they
build a car.
A car has a wheel, has the steering wheel, has all the doors, and everything is pretty
much a standard, right?
So sequel by itself is like the blow print of how the car should be built.
And then the companies come in and say, okay, this is the blow print from the car.
I'm going to build my own version of this.
So I implement sequel.
I implement my own car.
So this is my Honda.
This is my Toyota and so on.
Okay, now let's learn about select specific colors, so columns.
Instead of star, what if I just want to know the person's last name?
So I can just specify the name of the column here.
Now notice the name of the column, I get that from the left hand side of the definition
of the table, because last underscore name.
Make sure it's exactly like that.
And I click run sequel, you're going to see I only get the last name for every person.
Remember, each row represents a different person.
Now what if I want a first name?
Well, you can also separate this by a comma and say first name.
You don't need the space here to get the first name.
So you can see it always appears in order from left to right, last name first and then first name.
So if you want multiple columns to separate their names by a comma.
Now I typically like to add a space here just for visualization, just a style.
You don't need it.
It will be the same thing when I click run sequel.
Now you can also do first name in the left hand side and put a comma,
and that would just swap the order.
So you're going to see first name first and then last name as the second column to the right.
Now, what if I want to know where the customers are from?
Anybody want to guess, but I would type here.
What column would I select?
Yes, country.
So you can see where people are from.
This is actually good if you're gathering intelligence, you know, some business research for your...
Okay, where are customers from?
So you can see USA, UK, UAE.
Now this is nice like this, but you can see you have multiple people from the same country, right?
If you have like a million records, that is a million rows,
it would be very hard for us to identify all the countries our customers are from.
So we need to find a way to consolidate all of these into just a single country for each.
That is, I don't want to see USA more than once.
I don't want to see UK more than once because I only care if there's at least one person in that country.
I can know that I have a customer in that country.
So to do that, we can remove duplicates if we proceed the column name with the keyword distinct space.
So if I run that, you're going to see now I only get one row for USA, one row for UK, one for UAE.
That way I can easily know where my customers are from.
And I don't have to go through one million rows.
I think this would be at most 193 or something, right?
Because there's only at most 190 something countries.
Yes, we can make a count.
I would show you how to count later.
But if you're curious, you can just say count here.
And you'll give it a number instead of listing everyone.
All right.
Great. Now let's see, how can I look somebody up?
So far we've been doing queries that list a lot of people.
But what if I just want to know information about a specific customer?
Now let's delete everything here.
That's where we have to filter things out.
Let's say I want to find out information about all the columns.
So select star from, now the table is customers.
Now I need to know, I don't care about everyone, but just one person.
And then I want to know information about a person whose last name is Do.
So I can say here where last underscore name, that's the column name, equals, quotes, single, Do and semicolon.
And now it found all the people whose last name is Do.
What's going on here is going through the column, through all the rows and customers,
and showing all the columns where a specific column value, meaning last name, is equal to Do.
Notice I have strings, single quotes around it.
It's important because we want to literally mean the sequence of characters D, D O N E.
And that's usually called a string in computer programming.
It's a sequence of characters.
If you remove the quotes, it might mean something else to the program.
For example, when you don't have quotes surrounding last name, this is something that's not a sequence of characters, literally as it's written.
It's actually referring to the column for the table customers.
So be careful with the quotes.
Okay, one detail here.
If you type lowercase D for Do, what happens?
Ah, result set is empty.
It doesn't find anything.
So be careful with the casing here.
It matters.
Okay, what if I want to narrow down?
Okay, I want to narrow down to a person whose last name is Do, and first name is John.
Now to do that, you just go to where, and you just add literally the word and.
And you're going to repeat the same pattern of column name equals some value.
In this case, the column is first name equals value capital J, John.
Now we can narrow down to only one person whose first name is John, last name is Do.
Now this particular data set, you can see the right hand side.
This is the data set, the column, how many rows you have, and what rows they are.
It's very small, but I would bet if you had, you know, how many people we have in the world?
Billions, right?
So if I had a database with even a million, it would be very likely that you would find multiple people with the same last and first name.
So be careful with these kind of queries.
So it's usually better to look customers up by their unique ID.
In this case, if I want to exactly match John Doe without caring if there is more than one person with that name,
this specific John Doe, that's the first one here, I would go by his customer ID equals one, semicolon.
Notice I didn't put quotes around the number because that's a neat data type, which is different from Veracara, that is a string.
So this one is okay to not have quotes.
So if you do that, it will match, oops, I put, I forgot underscore ID there.
It must match the exact name of the column.
Now I found the customer whose ID is one, which is, that is John Doe.
This is the best approach when you're looking for some one record for a specific entity, in this case, the person.
Somebody asked, I tried selecting someone with a specific ID, but I had to enclose the ID inside quotes.
Does that happen in all SQL engines or is it just programments?
In general, SQL, if the data type is a number like int, you don't need quotes.
I don't know which one you tried in this environment.
If it's in this environment here, and the column is defined as int.
So make sure we're on the same page on the same environment here.
Okay, if you're using something else outside this environment, then I have no control.
I have no idea what the data type is for the columns that you created.
So check your table definition.
Make sure that if you're, that column is actually a int.
Select first name for customers across my D is two.
So what you did was using quotes.
Yeah, in this particular environment, the quotes won't affect if you're doing int.
Yeah, so depending on the implementation of SQL, things might change a little bit.
You know, I don't know what program is specifically is using underlying this environment.
If it's my SQL, if it's Maria DB, I don't really know.
So every flavor my SQL kind of have its own gotchas and, you know, quirks.
But in general, you don't need quotes when you have a data type that's int.
Okay, were you able to run it?
Yeah, okay.
All right, so that's all nice.
But how about how do we change the table, the records, adding someone new?
That's what we're going to talk about next.
So the key words here to insert new records is insert into separated by space.
Now you need a table name.
What table do you want to operate on customers?
Now you can say values, parentheses, semicolon, and then we're going to include the value for each column in order as they are defined.
So you're going to go on left hand side, you're going to see the first one is customer ID.
So we're going to grab the latest customer ID here.
In this case, I know there's one, two, three, four, five, and then it's going to be six.
But I know it's going to be six, but I just want to show you what happens if I try to insert same customer ID in this case.
Five, it already exists, right?
Okay, let's go to first name.
That's add nana, pama, last name, long, age, 21, country, Canada.
Now I'm looking at this table definition here and I can see the order is customer ID, first name, last name, age, and country.
That's why it must follow the order in the parentheses here.
Okay, make sure to add the quotes for the string or character varying types, bear care.
Now once you run this, you're going to see an error.
Unique constraint failed, customers dot customer underscore ID.
So what this means is for the table customers, when I say dot, it means something that belongs to customer.
In this case, the column customer underscore ID has a constraint that's called a unique constraint.
So we secretly can add what are called constraints to prevent certain operations from passing or getting through.
In this case, it's unique constraint.
This particular column is unique, meaning there can be no more than one row with the same value.
So I cannot have the same customer ID five again, and that's very important because we want to uniquely identify every row, right?
So instead of saying five, I'm going to say six, because I know there's no customer with the customer ID six.
If I do that, it's going to successfully execute.
Now you can see you notice the right hand side there appeared last row for customers with six and along 21 in Canada.
That's what we just added.
Typically, when we're doing these kind of operations, we don't really have to include the primary key.
Remember, the primary key is the uniquely unique identifier.
So you could have just omitted it.
I don't know if it would work in this environment, but most of them you wouldn't need to type it because it would take the latest value here, which be the next integer for the customer ID.
In this case, because we added six, it would be seven.
We can just try here.
Yeah, it doesn't work, but typically we could do that.
Now let me try adding seven comma Anna.
And it works.
So like I said, there can be in this particular table, there's no other constraint for the other columns, meaning I can have the same person.
We can have different people with the same name person less.
And even the aging country can be the same.
So that's why it's very important to have the unique ID to differentiate.
Any questions?
Okay.
Now, this is great, but what if I don't want to follow this order of insertion?
What if I want to first insert by typing the first name, or maybe the last name first and the first name last.
So you can actually specify the order of insertion of these values.
If you go before the values here, and you put parentheses and you specify the column names, let's first follow the current order customer underscore ID,
comma, first name, comma, last name.
If you notice, I always add space after comma, that's not necessary, but I like for style.
After every punctuation, I have a space, last name, age, and then country.
Now this is the original order.
Okay, if I do that, it's going to work.
If I change the ID there to eight.
Now, if what if for some reason, I don't know why you want to type the last name before the first name.
Now, if you do that, you can just swap them here, type the last name first, then follow this first name, fix the syntax,
and then make sure to change it here in your values.
Maybe I want to add like this first, then last, and I change this eight to nine because my next primary key is nine.
I'm watching the table on the right to know what's the last one.
As you can see, I added the one with number plus 99 and the first name last, and I specified last before first name in my values.
This is the way you can swap the order, you don't have to go in order as long as you specify them.
And accordingly, every value must match this one that you specified here.
For example, customer ID must be nine.
Last name is last, first name is first, age 21, countries Canada.
Okay.
Now, so these statements are going very long.
So you notice I put everything one line, but you might see at them in the real world, some people like breaking these statements into multiple lines.
That's totally fine. It won't matter.
Okay.
But there's a specific standard if you want to follow.
I don't know.
Usually people like automatic formatters.
I don't know if they have one here, but they click a button, everything formats accordingly.
They split into different lines, stuff like that.
I'll keep everything in just one line.
Feel free to do multiple lines if you think it's ugly to look at.
Okay, great.
But I got all these records.
They are duplicated.
I know they're duplicated, so I want to remove them.
Let's talk about deletion.
Now deletion, the keywords delete from, but you got to be careful here.
Give the table name, but stop here.
When you say delete from table name, this will delete everything.
All the rows will be gone.
So be careful using delete.
You always want to add a where and never read down to only one thing you want to delete usually.
So you want to add a where.
So typically we would delete by the customer ID, in this case, the primary key.
For example, I want to delete the customer ID nine, which is first and last.
So equals nine.
So this would go into customer stable, delete where the row matches customer ID equals nine.
Now I know customer ID is a primary key, therefore there's only one row with that.
So therefore this delete operation will only delete one thing.
So if you watch the right hand side, it's gone, the one of customer ID nine.
Now, I can definitely narrow this, change this where, but be careful with this.
Usually you just want to delete where primary key equals something.
Now, for some, in this case, I know what I'm doing, and I know there's only a few records.
So I want to delete all of these and alongs.
So I know the last name is long, and there's no other people with the last name long here.
In the real world, there will be definitely be.
So I wouldn't be using this.
But just for the sake of example, I want to delete customers whose last name is long.
So this will delete six, seven and eight, right?
They're gone.
Any questions so far?
Yes.
We got to be careful.
The real load, we have millions, not hundreds of millions of rows.
We don't want to be deleting the whole database, table records.
That's very bad.
You want to act with caution because database is the most crucial thing for every product today.
And typically you want to act on a very, you know, you got to be careful because if the data is gone,
things are going to be very bad.
Is there any way to recover the information once you delete it?
There is a way, but that might be time consuming in a hassle.
That's why you want to avoid it.
Because of course there are ways of, for example, every certain minutes you take a backup like a snapshot of how the data is,
why is it at a specific time.
In that case, if something went wrong, you could technically you could still recover, but that's a hassle.
Why?
Because you would have probably bring down the whole system.
You got to, if you bring it down, customers will complain.
We got people using it business hours.
Things are going to be a master disrupting everybody's work.
Maybe your application has 100 million users and you'll be losing money all this time.
It'll be very painful for the developers having to go on call.
So please be very careful database.
One thing you could do is, for example, and this is on the programming logic side.
You could make what they call a transaction.
So they're going to try to modify some data.
And if something fails, it rolls back.
That means it cancels the operation.
That's that's one way you can be careful about what you're doing.
But when you're working database, you always must come up with a plan for when something goes bad, you can know how to reverse the damage.
You know, if you know it's something that's a table that's not so risky, a product not many people use, you can, you know, it can be easier.
But if it's something core to the product, you got to be really careful.
All right, so let's talk about updating records.
Let's say Robert Luna, some reason moved to Canada.
How would we change his country to Canada?
Well, the keyword is update.
And you give the table name space customers.
Now, we have set, and then the column name equals new value.
This case, country equals Canada with single quotes surrounding.
Now, if I do it like this, semicolon, got to be careful.
This is going to set every single row to have country Canada.
We don't want that.
Okay, I'm going to do it for the sake of demonstration.
You don't have to do it.
I'm going to click Run SQL.
You can see everybody changed to Canada here.
So be careful.
You always want to add a where as well.
I'm going to reset my environment here by refreshing my page.
So the tables will be rebuilt to the original thing.
Of course, this is not how the real world works when it's changed, when it's gone is gone.
But this is just a practicing environment that resets every time I refresh the page.
In any case, I want to add where here.
Now, we can do by first name and last, but why is that bad?
Because there could be a record, another record, or the other people that have the same lane.
So you always want to do as best practice, use the unique row, sorry, unique column.
That's the primary key PK for short.
And that is the customer ID here.
So you want a customer ID.
What's Robert Luna's customer ID?
That's two.
So if you do that, only him and him only will have Canada for the country.
Any questions?
Okay.
And that is update.
I don't have much to say more on that.
Just like I said, the same thing applies from the delete advice.
If you want to update, make sure to use the primary key.
If you do something like first name, last name here, or some other nonunique thing,
it will update all the records, all the rows that might match that.
And it might be the case that you have millions of records that there's for sure one, many of them with the same name.
Okay.
Now let's talk about other tables.
We have another table called orders that show you what each customer purchased.
If you see in the right hand side, order ID number one, this is the primary key.
If you, if you noticed, typically the primary key is always the first column.
It doesn't have to be, but that's the standard practice.
You have, we have the item name, keyboard, how much it was, 400, whatever currency.
Let's assume it's dollar, US dollar.
So customer ID, who purchased this?
Now we see something interesting here that they didn't put the customer information in the stable of orders.
They just put the ID that points to a row in customer stable.
And that's four, and that's very important concept in SQL.
We don't want to be duplicating information across tables.
So we want to focus every table to be dedicated to a specific resource.
And if you need to point to that, just give it the primary key, the ID.
Okay.
So you're not going to be putting first name, last name in orders, whatever.
It's important to separate the concerns.
So we have a single source of truth.
So the single source of truth for customer data is the customer's table.
The single source of truth for order state data is the artist table and so on.
So if you need to reference information across tables, that is possible later as we'll learn with the join.
We can join tables together to match information.
Now when we have a table that matches information from another table,
we usually have the ID, a column that usually the name of the table,
in this case it's just customer ID, that points to the primary key of that table, in this case four.
And this is called a foreign key.
In the orders table, the customer ID is a foreign key that points to the primary key of customers, which is customer ID.
So keep those terms in mind.
So we have FK and we have PK.
All right.
So let's find out what did customer number three order?
Customer number three is David.
So we're going to go select star from orders.
Now I need to narrow it down where the customer ID is three.
Now I can know that this customer ID is number three.
In this case David, he bought a monitor and that's the only order he had.
Now did anybody buy more than one thing?
Seems like if I go customer number four, this person bought keyboard and a mouse.
And if you want to find out about the person, that's number four, which is John.
Now this is kind of tedious to be doing this stuff.
Obviously here I have the table for me to see, but I could have a million records and it would be very painful to have to scroll down a million times.
Or even I don't have access to it.
Maybe it's hidden initially.
I'd have to actually query for that.
This environment is very nice.
It's showing us the data, but now it's always like this.
You have to actually make a command to see that data.
So we want to be doing is joining tables so that we can match this customer ID with the one from customer so we can see side by side.
What's the first name, last name, and so on.
Okay, let's learn about join.
So we're going to go here select start from orders and add space before the where.
I'm going to say inner join separated by space.
Now I want to combine the orders table with the customers because I need some customer information.
And I have to say on what's the condition to match these two tables.
Well, I want the customer ID of the customer to match the customer ID of the order.
But now we got a situation here.
Okay, they have two tables with the same column name.
That's kind of ambiguous.
It doesn't know what it is.
See ambiguous column name if I run the SQL.
So in these cases, if you have tables with the same column name, you have to specify before the name.
What table you want to reference this column from.
In this case, I want the color customer ID from the customers dot customer table.
So you always put the name of the table dot the column.
So let's match on the customers customer ID with the orders dot customer ID.
And it's still giving me the same problem.
Now it's for the where.
So I'm in this in this where it doesn't matter if it's customer orders.
Let's just put orders.
Now you can see the this table here that was generated.
If you notice before we had to join, it was just the first four columns, right?
That's comes from the table of orders.
Now when it matches with the other table with the inner join, it's matching on the same customer ID.
So on the right hand side, if you think if you kind of split this in the middle here, it will kind of paste the column information from the customer stable.
So you can see it's exactly like the customer stable.
So that way we go, we know, okay, customer keyboard is bought the customer ID for whose first name is John last name right hard and so on.
So this is the way to generate reports to combining tables.
That's why it was important for us to have every table have their own single source of truth and you only reference each other via these IDs.
So that way you can build these queries and combine information.
Somebody asked, is there a reason why it repeated the column?
I'm assuming it's customer ID that you said.
And the reason why is because this is just doing a matching of tables.
If you if you imagine there's an imaginary thing here, I don't know if I can put it for you.
Let me see.
Border right with big solid red.
You see that I'm artificially creating a border there.
Using the devs tools.
Imagine everything to the left.
Everything to the right of the red thing.
Ignore that this would be just like the select star from orders.
And that's the order stable query.
When you match them, it will match according to the customer ID.
So when it finds the customer ID for in the other table, it will grab all the columns there and just to put it on the right hand side of this row.
So if you go to the right hand side of the red thing, that's just all the columns from the customers table.
So it's just doing that putting side by side makes matching on the same customer ID.
And the reason all the columns show is because I put star here.
If you want only specific columns, you would have to change the select star to something else.
For example, if I just want the order ID, item, and then maybe customer ID, but you got to say from where in this case, the same value, right?
It's four and four.
They still got to say because it could be from orders or it could be from customers.
By orders dot customer ID.
If I do that, it's like this.
Obviously, it's not useful.
So I want to add customers dot last name.
If it's not ambiguous, you can just drop the table name here.
I don't have to say customers dot last name.
But if you're unsure, always put the table name dot column there.
So if I do that, I can just capture these specific columns.
The star will always have repeated because we have the same column names.
One column is from orders, the other from customers.
And they're always matching.
Imagine visually one table on the left-hand side and you glue the other table on the right-hand side.
Any other questions?
Okay.
So I narrowed this down to just one customer, but you can remove the where to see every single order and the person who ordered what.
Let's add the star back so you can see all the information.
And you can see all the, every single order and the customer information appended to the right-hand side.
Now this is very small data set, right?
In the real world to be millions of millions.
So you probably want to limit your query to just a specific number of records.
If you have a really big table of many records, this is going to take forever and it's going to be very slow.
So you always want to limit in the real world to just a few rows just to inspect the data.
So you can always add limit here.
Let's say I just want to see the two orders, two examples.
I put limit two and it would just show me the very first two that it finds.
So very useful if you're working with a data set and you just want an example of what the data looks like.
Just put a limit, maybe 10 there out of millions, right?
What typically happens if you don't have the limits, you try to query a million rows.
It's going to freeze.
It's going to do something, maybe it's going to crash something in the browser if you're doing it from the browser.
I don't know.
Anyway.
Let's now ask some questions.
How can we ask how many keyboards were sold?
How can we know?
Well, I go to orders, select, start from orders.
And I must know only information about item keyboard.
So where item equals keyboard under quotes.
Now I can do this and see all the orders for keyboards, but that could be a million.
I cannot just go down counting all of them.
One row, two rows, three rows, be hard.
So we want to use what's called a count to aggregate that.
So you're going to say count star with the parenthesis surrounding it here to count the number of rows for this particular query.
In this case, I can know exactly how many keyboards were sold.
In this case too.
Now you can see count star is not a very friendly column name here to have.
So if you want to make a better name for that, we have adiases.
So you can say after that as keyboard sales, and it could be an arbitrary name.
Just so when you run the query, this name here becomes something more friendlier and easier to understand.
Now I know when I read this, oh, this is keyboard sales too.
Not just count star, because count star what?
That's why we can always add aliases to columns with the keyword as and whatever name you want.
Now if you want to have a space between the words keyboard and sales,
now if you do it with like this, you're going to error out.
Typically in programming, when you have spaces, it will consider the thing to be two different things.
One thing is keyboard, the other sales, and that doesn't point to any column or whatever.
So you want to add the quotes surrounding it if you all have spaces.
That way you can have the space here in the alias.
All right, typically we want to avoid adding spaces to things,
especially I don't know if any implementation of SQL would allow spaces for column names,
but that would be a nightmare to work with because of these things having to add quotes.
And even the capital, some SQL implementations might be a pain to work with
if you name your columns with other case letters.
So if you want to keep it simple, just make your column names lowercase
and separate the words by underscore.
That's like the safest choice, we'll have it in pain.
I know that Microsoft SQL likes to use the naming convention of Pascal case,
where they do Pascal case like this, but that's the Microsoft SQL.
I have worked with Postgres and I had capitalized table names
and space as well as a nightmare to work with.
You have to keep adding quotes every time and it was a nightmare,
so I suggest against doing that.
Yeah, snake case is the recommended case that I do.
Anyway, just follow whatever the company you're working for is doing.
They probably know what's best there.
Okay, great.
I want to just left you off with SQL Venn diagram.
There are different kinds of joins.
What we learned today is just one kind of join which is matching records.
I'll share this diagram that I always like to reference
when I'm figuring out what kind of SQL join I should do.
Let me open this in the new tab.
So the one we did, think of every circle here,
circle A and circle B like a Venn diagram.
Circle A is table A, in this case customers,
and circle B is table B which is orders.
So if you want to match them so that one row that appears in one
also appears in another, that is by the customer ID,
that's the one in the middle here, the intersection of these two tables.
That's the inner join.
If you see the syntax here, select whatever columns you want from table A,
the inner join table B on A dot key equals B dot key,
key being the customer ID.
So this is the one we learned today.
But if you ever need other kinds of joins, you can just read this diagram.
I'll post the link to Zoom.
And like this one would select all the rows in A,
but it doesn't necessarily have to be in B.
So it could be that you're finding customers and they don't have any orders.
If you want to match, show the customers here.
Because the one we use, I think it only works if the customer has an order.
So if you had a customer who didn't order anything, it would not show up there.
So if you still wanted the customer to show up despite having no order,
you would probably do this on the left-hand side, that's the left join.
You want to always show the customer record despite not having anything in B
or having, right?
Having anything would be the side of A, right?
The highlighted A, no orders.
And the A having orders would be the intersection.
So I always have this handy when I need to figure out what kind of join.
All right, so I think that's it for today.
And I'll finish.
Do you have any questions?
No comments yet (loading...)
No comments yet (loading...)
Did you like the lesson? ๐๐
Consider a donation to support our work: