Due: Friday, October 29, 2021
Reading from Python Data Science Handbook by Jake VanderPlas: Chapter 3, Data Manipulation with Pandas. The earlier Chapter 2, Introduction to NumPy, might also be of interest.
For this assignment you will write a Python script that uses the Pandas library to fetch data about variable stars from an SQL server and compute the altitude of each star above the horizon based on the star's position on the sky, the location of the observer, and the date/time.
Proceed as follows:
Download and install the Microsoft ODBC driver for MSSQL Server (Windows, macOS, Linux).
Using ODBC Data Source Administrator tool, configure a datasource for accessing the VariableStar database. Use the following attributes:
ODBC Driver 17 for SQL Server Name : VariableStars Server: algol.database.windows.net "With SQL Server authentication using login ID and password" Username: student Password: ******* Change to default database: VariableStars
The VariableStars.py program in the samples repository shows the specifics of how to connect to the DBMS. In particular, use the connection string: "mssql+pyodbc://student:2frenchfry!@VariableStars". Here the name "VariableStars" must match the name of the datasource you previously configured. The username (student) and password (*******) are part of the connection string (note this account has read-only access to the data.
We will define the observable stars in a constellation as those with a brightness variation of at least 1.0 magnitudes and where the minimum brightness is magnitude 10.0 or brighter (observable in good binoculars). For the constellation of Cassiopeia, the following SQL query returns such stars:
SELECT name, const, ra, dec, max_bright, min_bright FROM star WHERE const = 'Cas' and min_bright - max_bright >= 1.0 and min_bright <= 10.0;
For clarity, the query above is shown on three lines (which is fine), but it could also be a single line. The terminating semicolon is important. Also, single quotes are required around the constellation's abbreviation (Cas).
Here the 'ra' and 'dec' columns are the right ascension (in hours) and the declination (in degrees) of the star on the celestial sphere.
Write a program, homework-03.py that connects to the database, issues the query above, and prints the DataFrame returned by Pandas containing the raw data. Modify the program so that it prompts the user for the constellation abbreviation. Other interesting constellations to try include: 'Ori' (Orion), 'Aur' (Auriga), and 'CMa' (Canis Major). These are all constellations that are visible in the evening sky during this time of year (Northern Hemisphere). In every case the number of stars that satisfy the query is small and manageable.
The file star-functions-mssql.sql in the samples repository for this course (in the homework folder) contains a number of functions that were designed to execute on the MSSQL server itself. These functions compute the altitude above the horizon in degrees for a given star using the star's coordinates, the location of the observer, and the date/time when the observation was made as input.
Modify homework-03.py to accept the GPS coordinates of the user and a date/time from the user as input. Then use that information to compute the altitudes of all the stars returned by the previous SQL query. Use Pandas to do these computations in parallel (i.e., for every star at once without explicit loops in your program). Display the results.
Submit your homework-03.py to Canvas.
Last Revised: 2025-01-09
© Copyright 2025 by Peter Chapin <peter.chapin@vermontstate.edu>