Loading
Lesson 44
Courses / HackerRank SQL Problem Solving Solution Challenges
Occupations | SQL Advanced Select | HackerRank Solution

Video Transcript

Hey, I will do the challenge occupations from the subsection advanced select of SQL and hacker rank. So we're given this table occupations that has the columns name and occupation and we're asked to produce the following. The simple output should be a table. Each column is a list of names that follow the same profession. For example, in this case here, we see that we got Jenny and Samantha in the output for the first column. What is that column about? Now, if you look at the input, Samantha is a doctor and Jenny is also a doctor. So we take all the names whose occupation is doctor and place them as the first column here. And then we do the same for every other occupation. For Ashley, that's a professor, Christine and Kedi, we place them in the second column because they're all with the occupation professor. Do the same for the actor and the singer one. I will solve this problem using window functions, derived tables and aggregate functions. And now we use my SQL. So to help you understand how much solution is going to go, I want to first talk about window function. Let's say we're going to just select name occupation from occupations first so we can understand what the data looks like. So as you can see, we got the name there corresponding occupation. Now let's say I want to count how many rows I have. So I can do select count star from occupations, right? And click run code. As you can see, there are 18 rows. But what happened here was the 18 rows collapsed into a single row with a number in the output. What if I wanted to have that 18 for every single row and I wanted to print the original rows but with the 18 as the third column? Now to do that, I can do a window function like this. If I do select count star over like so, and I can even add the name and occupation back here so you can better visualize it. So when I say the over there with the parentheses, I can now get that count for every row so you can see 18 is in every row. So it's important we understand this concept because we're going to use it to produce the output for this problem. And we can go even further. What if I want to count on a partition discount? Say I want to count just a number of people in a certain profession. So I can go here inside the parentheses for over and I can say partition by and I'm going to give occupation. So if I do that, it's going to count by occupation. So we can see Samantha is an actor and now it's no longer 18 rows. It's four because in total, it found four rows that all have occupation actor. So we can see every actor now have four. For doctor, they'll have three because they're in total three doctors and for professor seven. Okay. Now we can go further. What if we want to take these and order by name? Say I want Eve to come before Samantha. I can go inside partition by occupation and say order by and I can say name. So as you can see, it orders by name. Eve comes before Jennifer, Ked and Samantha. So we're going to keep these concepts in mind because you're going to use for the solution. So now I want to create a column for each of these professions. So I can let's start with just the one, the first one that's going to be doctor. So I'm going to go here. We have one, two, three, and then going to create the fourth column. And for this column, I want to display the value for the name only if it matches the profession or occupation that I want. In this case, I'm going to create the column for doctor. So it should match occupation equals doctor. So to do that, I'm going to use an if statement. You can also use case or whatever other thing. So if the occupation is equal to doctor, what we're going to do, we're going to display the name here. Otherwise just going to say no. Okay. So what's going to happen if I run this code? So as you can see, we have the last column that's before and I have Amina, Julia and Priya. And these are actually what's going to be in the final results. We just need to get rid of all these unnecessary nodes before. Okay. So there will be some nodes that will appear in case that one of the columns has more rows than the others. For example, this one, Jenny, Samantha, no, and then Ashley, Christine and Katie. So but we don't have to have so many no. So we're going to get rid of them later. But for now, this column is for doctor. Then you repeat the same pattern for all the three other professions. Okay. So the same pattern if you can do with one, you can do with all of them. So we're going to do here, I'm going to create another column. And this one, if the occupation is equal to professor, we're going to display the name. Otherwise, we're going to say no. If I click round code and verify what it looks like, as you can see, it now appeared right here. And this is because you can see Ashley is selected because that's the row for Ashley. Velvet selected, that's because the S row for Velvet. And the Darault professors, as you can see. So I left the name and the occupation here, just so we can visualize what's happening, how the data is being shaped. But in the end, the final result, you don't need to display or have these two in your query. Okay, the name and the occupation. Let's do the other ones. So after professor, you're going to say if occupation is equal to singer, we're going to display the name. Otherwise, else case, we're going to do no. Okay. So the argument is the if condition. The second is what happens if that is true. And the third argument is the else case, what happens if that condition is false? So we return no. Comma, let's do if occupation is equal to actor. Then we're going to name. Otherwise, else we do no. Okay. So we got all four of those. Okay. So let's run the code and see what we get. As you can see, now we have the columns for actor here in those rows for doctor in those rows for professor in those rows and for singer on in those rows. Now we got a shape this so it's better to the final result here, right? Got rid of all these no's and remove the first three columns. So we can get rid of name occupation here is not necessary. And what I'm going to do is I have to do a derived table here. So I'm going to select something. Okay. I'll do it in a second from this whole result, the stable that we created. Okay. Now, because my SQL when you do derive tables as for an A list, we have to say as some name, whatever I'm going to say derived table. Now what I'm going to do here is I want to select an any an aggregate function. It could be mean, for example, I just need an aggregate. That's the thing because I'm going to do here group by and I need to group by this value here value of the remember that number the 123 4 based on the actor or 123 based on doctor, I need that number here. So I need I'm going to create an alias for this so I can reference in this group by outside. So I'm going to say here, we can say maybe row numbers and like that. Okay. And here outside me is the same name row number. So I'm going to group by them. That means I when I have the one here, all the ones you can see one one, I'm going to group them meaning I collapse them into a single row. So this one will be combined with this one will be combined with that one will be combined with that one. Right. So if I take them here so you can better visualize and I'll paste them here so you can see this one. Do I do actually? Yeah. So we're going to take all these combined based on this number because the same number they all become one. So you're going to have let me remove those because we don't need them. So essentially you're going to have a Mina here. We got Christine here. We got Ashley here. So something like this. Okay. So that they are each in their proper column without all those extra stuff. Okay. So that's what it's doing when you do a group by that. So to do this I need to do a select here and I'll select using an aggregate. So I'll use min. Okay. I need to reference the value of the column for the doctors. I need a name is here to make it easy for me to reference there. So I can see as whatever name you want, you can get colleges doctor and you can say doctor here and then you do the same thing for every other column. So I'm going to do a common min professor and at this name professor has to be the alias for this one as professor. And then I'm going to do min of singer and this name singer has to match the alias for this one right here as singer. And finally I have to do min of actor and this name actor here has to match the alias for this column here. So I'm going to say as actor. Now let's run the code. Now we get an error and that's because I used row number here and that's probably a keyword. So I have to say another name that's not exactly row number. I remove the E and just number. Let's see if that works. And there you go. Now we can see we got those columns consolidated into one and you can see the first column for doctor, the second for professor and the third for singer, the fourth for actor.
No comments yet (loading...)
No comments yet (loading...)
Did you like the lesson? 😆👍
Consider a donation to support our work: