PSU 2012

SQL: Introduction

The PowerSchool database contains data that you access through a web page interface. The interface is easy to use, but sometimes you need more flexibility. As your database grows, you may want to search in more detail or access data from across schools, and the interface is not the easiest way to do that. With SQL (Structured Query Language), you can make detailed queries across the whole database.

By the end of this session, you will be able to:
- Describe the basic database structure of PowerSchool
- Use SQL to write basic queries with filters and simple joins of two tables
- Use SQL with PowerViews

Relational Databases

Spreadsheets are used to collect and organize data. Columns contain fields, and rows contain instances of the fields. Schools could use spreadsheet programs like Excel to store and organize student data, but the spreadsheets would be too large to work with.

Relational databases work much like a set of interlinked spreadsheets. They are sets of tables that can be connected by data shared across the tables. For example, in PowerSchool, the Students table connects to the StoredGrades table using a student ID that they have in common. The tables have different names for the data (ID in the Students table and StudentID in the StoredGrades table), but the data is the same.

In these tables, the columns represent fields, and the rows are filled with instances. To read a table, go down to the row you want, and then read across to find all the data for that instance. For example, in the Students table, read down until you find the correct last and first name, and then read across to find grade level, gender, and other information about the student.

You can make database diagrams to show the relationships between tables. In these diagrams, tables are represented by boxes filled with their field names and connected by lines to show how they link.
You can find information on PowerSchool's data structure of 300 tables in the Data Dictionary, available through PowerSource. Find some of the more common tables by going to Direct Database Export (DDE). Access DDE by clicking System > Direct Database Export (DDE).

Activity 1: Examining Tables and Diagrams

In this activity, practice one of the two ways of finding information about PowerSchool's data structure, and see how some tables are related.

1. Using the Data Dictionary or DDE (Direct Database Export), find the Students and CC tables. What fields link the two tables?

2. If you want to make a list of letter grades, student names, course names, gender, and grade levels from one school and one term, what fields should you use from the database?
   Answer:

3. Examine the database diagram in the slide shown. What is the relationship between the tables as pictured in the diagram?
   Answer:
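The link between the Students and StoredGrades tables described under Relational Databases can be tried outside PowerSchool. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in database, the two tables are cut down to a few columns, and every row is invented sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not PowerSchool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, lastfirst TEXT, grade_level INTEGER);
    CREATE TABLE storedgrades (studentid INTEGER, course_name TEXT, grade TEXT);

    -- Invented sample rows, for illustration only.
    INSERT INTO students VALUES (1, 'Adams, Ann', 9), (2, 'Baker, Ben', 10);
    INSERT INTO storedgrades VALUES
        (1, 'Algebra 1', 'A'), (1, 'English 9', 'B'), (2, 'Biology', 'C');
""")

# Read down to the row you want, then read across for its fields.
student = conn.execute(
    "SELECT id, lastfirst, grade_level FROM students WHERE lastfirst = ?",
    ("Adams, Ann",),
).fetchone()

# The value in students.id is the same value stored in storedgrades.studentid,
# so it can be used to find that student's rows in the other table.
student_id = student[0]
grades = conn.execute(
    "SELECT course_name, grade FROM storedgrades WHERE studentid = ? ORDER BY course_name",
    (student_id,),
).fetchall()

print(student)
print(grades)
```

The two field names differ (id and studentid), but the shared value is what connects the tables.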
Why Use SQL?

SQL is a powerful tool, but sometimes a power saw is too much, and a paring knife will work. So how do you decide which tool to use: Quick Export, DDE, or SQL? It depends on the details of the job. If you want just a few fields from the Students table, then it makes sense to use Quick Export or DDE. You can join two tables in Quick Export and DDE, but joining tables in SQL is easier.

The Basics of SQL

SQL is the language of databases. Your PowerSchool license allows you to construct SQL statements called queries to pull data directly from the database. This course covers the basics of how to write these queries.

SQL queries have three main components: the SELECT clause (get what data?), the FROM clause (from what tables?), and a final clause that holds special instructions (such as filtering or ordering the data).

To get data from more than one table, add a JOIN statement and specify where to match the rows in the different tables (ON). For each table you add to the query, an additional JOIN is required. In a query of two tables, use one JOIN; in a query drawing from three tables, use two JOINs; and so on.

You can also filter and sort data using the SQL commands WHERE and ORDER BY. Other SQL calculations and formatting changes are covered in the Advanced SQL class.

When typing SQL queries, the convention is to type SQL commands in all caps. The database doesn't require capitalized commands, but using all caps makes it easier to read the queries and check them for errors.

When joining tables, indicate which table each field in your SELECT clause comes from. In the SELECT clause, type each field name preceded by the appropriate table name, separated by a period (for example, students.lastfirst). This syntax tells the database which table holds each field you want.

Activity 2: Writing SQL Queries

You'll build several SQL queries, working sequentially and adding complexity.
Open SQL Developer, connect to your assigned server, and open the query pane if it does not open automatically.

For this activity, analyze the distribution of letter grades in your student population.

1. To start, type this basic query:

    SELECT lastfirst
    FROM students
    WHERE schoolid=100

   Click Execute Statement using the green right-facing arrow above Enter SQL Statement. What did the query return?

2. Add gender and grade level to the query by putting their field names in the SELECT clause:

    SELECT lastfirst, gender, grade_level
    FROM students
    WHERE schoolid=100

   Click Execute Statement. What did the query return?

Copyright 2012 Pearson

3. Join the StoredGrades table to the query by adding a JOIN statement and an ON statement:
    SELECT students.lastfirst, students.gender, students.grade_level
    FROM students
    JOIN storedgrades ON students.id=storedgrades.studentid
    WHERE students.schoolid=100

   The table name goes after the JOIN, and the fields to use for the join go after the ON. Specify which table each field comes from when making a join.

4. Add the course_name, grade, and termid fields from the StoredGrades table to the SELECT clause. Also, add termid to the WHERE filter, and use the ORDER BY command to sort by last name:

    SELECT students.lastfirst, students.gender, students.grade_level,
           storedgrades.course_name, storedgrades.grade, storedgrades.termid
    FROM students
    JOIN storedgrades ON students.id=storedgrades.studentid
    WHERE students.schoolid=100 AND storedgrades.termid>=2200
    ORDER BY students.lastfirst

Typing every field name preceded by its table name can be tedious, especially when the query is long and complex. To shorten your query, use an alias. To make an alias, type the letter or letters you want to use for the table name after the official name of the table in the FROM clause. For example, if you want the letter "s" to stand for the Students table, define "s" as the alias in the FROM clause like this: FROM students s. You can also create an alias for a field to rename columns in the output data.

5. To make the above statement and its output easier to read, make aliases for the tables and rename the grade field from the StoredGrades table as "Letter Grade."

    SELECT s.lastfirst, s.gender, s.grade_level,
           sg.course_name, sg.grade "Letter Grade", sg.termid
    FROM students s
    JOIN storedgrades sg ON s.id=sg.studentid
    WHERE s.schoolid=100 AND sg.termid>=2200
    ORDER BY s.lastfirst

6. To save this query, use the File menu and choose Save. Give the query a logical name. To begin a new query, clear the command window, or choose File > New, and select SQL file.

Next, suppose you are working on a grant proposal and need teacher demographic data by school.

7. Start with a query of the Teachers table; use aliases for the table names.
    SELECT t.lastfirst, t.ethnicity
    FROM teachers t

8. Join the Schools table to your query, matching schoolid from the Teachers table to school_number from the Schools table, and add s.name to your SELECT clause.

    SELECT t.lastfirst, t.ethnicity, s.name
    FROM teachers t
    JOIN schools s ON t.schoolid=s.school_number

   Save this query and open another.
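To see what the query from step 8 returns without a PowerSchool server, the sketch below runs it against Python's built-in sqlite3 module. This is an assumption-laden stand-in: the Teachers and Schools tables are reduced to the columns the query uses, and all names and values are invented sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not PowerSchool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools (school_number INTEGER, name TEXT);
    CREATE TABLE teachers (lastfirst TEXT, ethnicity TEXT, schoolid INTEGER);

    -- Invented sample rows, for illustration only.
    INSERT INTO schools VALUES (100, 'North High School'), (200, 'South Middle School');
    INSERT INTO teachers VALUES
        ('Lopez, Maria', 'H', 100),
        ('Nguyen, Tom', 'A', 200),
        ('Smith, Pat', 'W', 100);
""")

# Same shape as step 8: t and s are table aliases, and ON names the pair of
# fields whose values must match (teachers.schoolid and schools.school_number).
# ORDER BY is added here only to make the output order predictable.
teacher_rows = conn.execute("""
    SELECT t.lastfirst, t.ethnicity, s.name
    FROM teachers t
    JOIN schools s ON t.schoolid = s.school_number
    ORDER BY t.lastfirst
""").fetchall()

for row in teacher_rows:
    print(row)
```

Each teacher row picks up the name of the school whose school_number equals the teacher's schoolid, which is why two tables need only one JOIN.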
Now you need a count of students by school and enrollment status.

9. Start with a placeholder SELECT clause, and join the tables you need.

    SELECT s.lastfirst
    FROM schools sc
    JOIN students s ON s.schoolid=sc.school_number

The GROUP BY clause is used with the SELECT statement to arrange identical data into groups. It is typically used together with aggregate functions such as AVG, MAX, MIN, SUM, and COUNT. The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause, if used. GROUP BY differs from ORDER BY, which only sorts the SELECT results.

10. Since you want a count by two grouping factors, add a GROUP BY clause. This returns the number of students for each combination of school name and enrollment status.

    SELECT COUNT(s.id)
    FROM schools sc
    JOIN students s ON s.schoolid=sc.school_number
    GROUP BY sc.name, s.enroll_status

11. Specify the data to display. Keep the count and add the fields from the GROUP BY clause to the SELECT clause. Add an ORDER BY clause to sort by enroll_status values.

    SELECT COUNT(s.id), sc.name, s.enroll_status
    FROM schools sc
    JOIN students s ON s.schoolid=sc.school_number
    GROUP BY sc.name, s.enroll_status
    ORDER BY s.enroll_status

    What does the resulting data look like? How does it change if you reverse the order of the fields in the GROUP BY?

12. Save the results data output to your desktop. Name the file Student Count. Locate the file and change the file extension to .txt. Open the file with your spreadsheet program to work with it further.

PowerViews

PowerViews are what PowerSchool calls views. Views are not actual tables but rather the result of complex SQL queries designed by developers to make querying and reporting easier. Views are based on pre-built SQL queries that are run automatically by the database engine. They are dynamic and change as the tables they are built from change. PowerViews combine the most requested data elements into a single table view.
This pre-queried data gives you easy access to information without having to create complex SQL queries. You can query PowerViews as you would a table, but their names are long and unusual. You can find information on the different PowerViews and what data they contain in the PowerViews Data Dictionary.

You can write queries of PowerViews as though they were regular tables, except that you must insert ps. in front of the name.

Some data is more accessible in PowerViews than in the regular tables. This is particularly true of data in the Gen table or custom field data. Using PowerViews to get data from custom fields is covered in the Advanced SQL class.
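The idea that a view is a stored query rather than a table, and that it changes as its source tables change, can be sketched with Python's built-in sqlite3 module as a stand-in database. Everything here is an assumption for illustration: demo_student_school is a hypothetical view name (not a real PowerView), the rows are invented, and the ps. prefix is left out because the stand-in database has no ps schema.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not PowerSchool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools (school_number INTEGER, name TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY, lastfirst TEXT, schoolid INTEGER);

    -- Invented sample rows, for illustration only.
    INSERT INTO schools VALUES (100, 'North High School');
    INSERT INTO students VALUES (1, 'Adams, Ann', 100);

    -- demo_student_school is a hypothetical view, not a real PowerView.
    -- It stores the query text, not a copy of the data.
    CREATE VIEW demo_student_school AS
        SELECT s.lastfirst, sc.name AS school_name
        FROM students s
        JOIN schools sc ON s.schoolid = sc.school_number;
""")

# The view is queried exactly like a table; the join is hidden inside it.
view_query = "SELECT lastfirst, school_name FROM demo_student_school ORDER BY lastfirst"
rows_before = conn.execute(view_query).fetchall()

# Add a row to an underlying table, then query the view again.
conn.execute("INSERT INTO students VALUES (2, 'Baker, Ben', 100)")
rows_after = conn.execute(view_query).fetchall()

print(rows_before)
print(rows_after)
```

The second result includes the new student even though the view itself was never updated, because the database reruns the stored query each time the view is read.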
Activity 3: Using PowerViews in SQL Queries

Enhance some of your previous queries with data from PowerViews. These activities are only a sample of what you can do with PowerViews.

1. With PowerViews, you can add ethnicity data to an analysis of grades more easily. Add a JOIN to Student Demographics, and then add the ethnicity name to the SELECT clause.

    SELECT s.lastfirst, s.gender, s.grade_level, d.ethnicity_name,
           sg.course_name, sg.grade "Letter Grade", sg.termid
    FROM students s
    JOIN storedgrades sg ON s.id=sg.studentid
    JOIN ps.pssis_student_demographics d ON d.student_number=s.student_number
    WHERE s.schoolid=100 AND sg.termid>=2200
    ORDER BY s.lastfirst

You can also add Special Programs enrollments using the correct PowerView. But because not every student has a record in that PowerView, you must use a special type of JOIN.

There are two types of SQL JOINs: INNER JOINs and OUTER JOINs. An INNER JOIN is the most common join and is the default join type. If you don't put INNER or OUTER in front of the SQL JOIN keyword, then INNER JOIN is used.

The LEFT OUTER JOIN returns every row from the first table, the one listed in the FROM clause, along with any matching rows from the second table. Where the second table has no match, its fields come back empty (null). So in this case, use a LEFT OUTER JOIN to get all student records and any matching Special Programs enrollment records. LEFT and RIGHT JOINs are explained in greater detail in the Advanced SQL course.

2. Add a LEFT OUTER JOIN to the Special Programs Enrollment PowerView.

    SELECT s.lastfirst, s.gender, s.grade_level, d.ethnicity_name, sp.program_name,
           sg.course_name, sg.grade "Letter Grade", sg.termid
    FROM students s
    JOIN storedgrades sg ON s.id=sg.studentid
    JOIN ps.pssis_student_demographics d ON d.student_number=s.student_number
    LEFT OUTER JOIN ps.pssis_special_pgm_enrollments sp ON sp.student_number=s.student_number
    WHERE s.schoolid=100 AND sg.termid>=2200
    ORDER BY s.lastfirst

   How would the results differ if you used an INNER JOIN?
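One way to explore the INNER JOIN question is to run both join types against a small stand-in database. The sketch below uses Python's built-in sqlite3 module; demo_program_enrollments is a hypothetical table standing in for the Special Programs PowerView, and all rows are invented sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not PowerSchool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_number INTEGER, lastfirst TEXT);
    -- Hypothetical stand-in for the Special Programs enrollment PowerView.
    CREATE TABLE demo_program_enrollments (student_number INTEGER, program_name TEXT);

    -- Invented sample rows: only one of the three students has an enrollment.
    INSERT INTO students VALUES
        (1001, 'Adams, Ann'), (1002, 'Baker, Ben'), (1003, 'Cruz, Carla');
    INSERT INTO demo_program_enrollments VALUES (1001, 'Reading Support');
""")

# Plain JOIN is an INNER JOIN: only students with a matching enrollment appear.
inner_rows = conn.execute("""
    SELECT s.lastfirst, sp.program_name
    FROM students s
    JOIN demo_program_enrollments sp ON sp.student_number = s.student_number
    ORDER BY s.lastfirst
""").fetchall()

# LEFT OUTER JOIN keeps every student; program_name is NULL (None in Python)
# for students with no matching enrollment row.
left_rows = conn.execute("""
    SELECT s.lastfirst, sp.program_name
    FROM students s
    LEFT OUTER JOIN demo_program_enrollments sp ON sp.student_number = s.student_number
    ORDER BY s.lastfirst
""").fetchall()

print(inner_rows)
print(left_rows)
```

With this sample data the inner join returns one row and the left outer join returns three, so students without a program record are dropped by the first query and kept by the second.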
Key Points

- Primary Keys and Foreign Keys: Define table relationships
- Quick Export, DDE, or SQL: Use the right tool
- Basic SQL: SELECT [somestuff] FROM [sometable]
- WHERE: Filters search results
- JOIN: Combines multiple tables together with ON statements
- PowerViews: Saves time by connecting frequently used data into one virtual table