Category
📚
LearningTranscript
00:00Let's dive a little bit deeper into data connections by showing some examples of local file, database, and Tableau server connections.
00:08When you connect to local flat files, PrepBuilder will show tabs for Excel files and can union files within a given directory.
00:17This includes CSV and Excel files.
00:20You'll also be able to leverage Data Interpreter for these flat files.
00:24This is similar to Tableau's Data Interpreter tool, which will automatically clean up different pieces of data inaccuracies like extra rows, column headers, and more.
00:34In addition to local files, you'll also be able to connect to databases.
00:39When you connect to a database, you have to enter credentials in order to be able to access the different schemas, tables, and views within that database.
00:48Note that you first have to be given permission and access to those databases before you can connect to them, but all of the standard connectors are available inside Tableau Prep.
00:57Note that Data Interpreter is not available for database connections.
01:01Finally, you'll be able to connect to Tableau server published data sources.
01:06When you connect to Tableau server, you'll need to enter your server credentials, or use single sign-on, to view all of your published data sources, tables, and files.
01:16Again, you'll only be able to see what you have access to.
01:20So if you have access to a published data source, you'll be able to connect to it inside Tableau Prep and use it as an input for your flow.
01:28Note that Data Interpreter is not available for Tableau server connections inside Tableau Prep.
01:34Now that we've talked high-level about the different data connection types, let's dive in and actually connect to some data sources inside Tableau Prep.
01:42Alright, so we're inside Tableau Prep.
01:45First up, we're going to demo some basic connections.
01:48Then, we'll dive straight into building our data input connections for our main workflow.
01:54Now the initial view that you'll see inside Tableau Prep will look very similar to the initial view that you see inside Tableau Desktop.
02:01On the left-hand pane, you'll have your data connections, which you'll be able to expand and search through.
02:06In the middle, you'll have recent flows that you've opened, very similar to your recent workbooks in Tableau Desktop.
02:13And on the right-hand side, you'll have your Discover pane, which links you to the Tableau community.
02:18For now, though, we're going to connect to a simple Excel file to begin our journey connecting to data inside Tableau Prep.
02:25Let's go ahead and click Connect to Data and search for Microsoft Excel.
02:30Now you can follow along if you'd like, because we will be connecting to one of the sources that we're going to use in the main flow.
02:35But you won't need to save this part of the work, as we'll be connecting to very specific data sources for our main flow.
02:41And this is just a beginning example of how to connect to Microsoft Excel flat files.
02:46Let's go ahead and click on Microsoft Excel, and then we'll navigate to the file that we want to connect to.
02:53In this case, we want to connect to Plans for High School Grad District 2018-2019.
02:59Let's click Open, and our flow will begin to load.
03:04You'll notice that we get our initial view, which will have a small profile of our data input, what the field names are,
03:10and some sample values of our data that we're connected to.
03:13On the left-hand side, we'll be able to see the tabs within our Excel file,
03:18and we'll also be able to see our flow that starts developing up here in the flow pane.
03:23Now right off the bat, we can see that there's an issue with our connection.
03:27Our fields appear to not have proper naming.
03:30This is because we have extra rows at the top of our Excel sheets that have no value.
03:35Our real data starts a few lines down from the top, but how do we dynamically fix that inside Tableau Prep?
03:42Do we need to go back to the Excel file and clean that up on our own?
03:46The answer is no.
03:47We can do that with the use of Data Interpreter.
03:50Again, this is only available for Excel files, but it can be very helpful for those simple things that will complicate your data connections.
03:58Let's go ahead and click Use Data Interpreter, and you can see our data has been cleaned up.
04:03Our extra rows at the top of our input have been removed, our field names are normalized, and our sample values are cleaned up.
04:12Now we're going to dive much deeper into all parts of data input, but for now we just wanted to show a quick connection to Excel so that you can get your feet wet with inputting data into Tableau Prep.
04:23Up next, I'll be demoing connections to databases.
04:27So for this example, just follow along and watch me connect to a database, as yours will be configured for your specific database type and your credentials.
04:36But the general process will be the same.
04:39Now let's go ahead and add a new connection, and we'll navigate to our data source.
04:45I'm going to want to connect to PostgreSQL because I have a PostgreSQL server on my computer.
04:51Once your database connection dialog box pops up, you'll just need to enter your credentials.
04:56In this case, with PostgreSQL, I need to enter my server, my port, my database, and my username and password.
05:03Once I've entered my credentials, I can click sign in, and my database will now be connected.
05:08You can see that I have my Excel connection and my PostgreSQL connection.
05:12If I go into the flow pane, I can drag out any of the tables that are available in my PostgreSQL database.
05:19You can see that I have a few here, but I'm just going to drag one out into my view and drop it.
05:25You'll see a new connection will form, and I'll be able to, again, profile my data
05:30and look at my different fields, my data types, my keys, and more.
05:36Again, we'll dive into the intricacies of the input step a bit more in the next few lessons,
05:41but this is just an example of how to connect to a database.
05:45Up next, I'll be demoing connection to a Tableau server input.
05:49Again, if you don't have a Tableau server, no worries.
05:53You can just look how I'm connecting, and when you get to that step at work or at home,
05:58then you'll know the basics of connecting to that data source.
06:02I'm going to go ahead and click on my Add Data Source icon here.
06:06I'm going to go down to my Tableau server option.
06:09I'm going to paste my Tableau server URL in and hit Connect,
06:13and then I'll be prompted for my email and password to my Tableau server.
06:17I'll simply put in my credentials and choose Sign In,
06:21and from there I'll be prompted with published data sources that are available on my Tableau server.
06:26Let's go ahead and click on my Homework 5 output and choose Connect.
06:31As you can see, the data source comes in just like any other data source that I've connected to,
06:37be it Excel, a database, or Tableau server published connection.
06:42From here, I'll be able to bring in any of these sources into my flow and do data manipulation along the way.
06:48Again, there will be different specifics for which databases or flat files or Tableau servers you're connecting to,
06:54but generally, the connection is pretty straightforward regardless of what you're connecting to.
06:59Now that we've looked at our base examples for connecting to local files, databases, and Tableau server published sources,
07:07let's go ahead and start building the base of our workflow for the main course.
07:12I'm going to go ahead and close this prep flow and open a fresh one so that we can get started on our main flow.
07:18I'd advise you to do the same just so that you start from a clean slate.
07:22Alright, so I've opened up a fresh Tableau prep instance so that we can start building our main flow.
07:28Let's go ahead and connect the data just as we did in the prior examples.
07:32Now the files that we're going to be using for our flow are both Excel and CSV.
07:38We won't be connecting to any Tableau published data sources or databases for the duration of the course.
07:44Although we will do examples to show how it's done, but we won't use them for our main build.
07:49Let's go ahead and go down to Microsoft Excel and connect to our first source.
07:54As you look through the sources that are contained in the package file that you've opened for the main course,
08:00you'll see that we have a myriad of Excel and CSV sources.
08:04And most of these sources are split between either school-level data or district-level data.
08:10You'll see this theme throughout the course.
08:12Our flow will start with two main sections, one related to school-level data and one related to district-level data.
08:19As we combine and transform the data, some of these will be pulled together and some will be separated out.
08:26To start, let's work on the school-level connections.
08:29So navigate through the files that we've been given and go to the plans for grad files.
08:34Specifically, you're going to want to select plans for high school grad school underscore 2018-2019.
08:42This is the plans for high school grad school data source for the 2018-2019 school year.
08:49Most of these are split school years because they're based in the United States,
08:53where the school year starts in the fall and ends in the spring.
08:57Let's go ahead and click on the file and choose Open.
09:00Now our data will automatically be pulled in because there's only one tab in our Excel source.
09:05You'll see that it's been added to our flow pane and we've got a brief preview of what the data looks like.
09:11Like our prior example, our headers aren't coming in clean because of those extra rows at the top of our Excel file.
09:17So we're going to want to choose Use Data Interpreter to clean that up.
09:22Now there's some more things that we're going to want to do to clean this data upon input,
09:26but we're going to save those for the next lesson.
09:29For now, we just want to get all of our sources into the flow so that we can start working with them.
09:34Let's go ahead and hit the plus sign and add another Excel source.
09:39We're going to click Excel and then we're going to be looking for the Educator Evaluation Performance data source that's school level.
09:47So we want to select Educator Eval Perf underscore School 2017 2018.
09:54Let's go ahead and click Open and our source will be brought into the view.
09:58Again, we'll want to choose Use Data Interpreter to clean our headers up because those aren't coming in correctly.
10:05And you'll see those will get cleaned up right away.
10:08Let's go ahead and add another data source to the view.
10:11We'll hit the plus sign, go down to Excel, and now we're going to navigate to the SAT Performance Report School Level Report.
10:19So we'll go to SAT Performance School underscore 2018 2019.
10:26I'm going to hit Open.
10:28My data source will automatically come into the view.
10:31My headers, again, are not coming in clean.
10:34Let's clean that up with Data Interpreter.
10:36And those are fixed.
10:38Alright, let's go ahead and add our next source.
10:40So hit the plus sign, choose Microsoft Excel,
10:44and we're going to want to navigate to the Per Pupil Expenditures for the district level.
10:50So we can see Per Pupil Expenditures District, and we're going to pull in the 2018 file.
10:56Let's click Open, and our data source will come in.
11:00Again, we'll use Data Interpreter to clean our headers up, and there we go.
11:04Let's go ahead and add another source.
11:07We'll choose Microsoft Excel,
11:09and we're going to want the Advanced Course Completion District Level Source.
11:14You can see Advanced Course Completion District 2019.
11:18Let's go ahead and click Open, and the data source will come into the view.
11:23We'll choose Data Interpreter to clean that up, and that's effectively cleaned up our headers.
11:28Let's go ahead and click Add New Connection, and this time we're going to choose Text File,
11:33because we're going to be bringing in a CSV.
11:36Let's choose Text File, and we're going to navigate to Teacher Salaries District 2017-2018.
11:44Let's click Open, and our teacher salaries data will be added to the view.
11:49Alright, so we're going to be cleaning up this data a little bit later in the flow,
11:53but let's take a look at what we brought in.
11:56It seems like we're missing one data source that we're going to need that's school level,
12:01but we've already brought in our school and our district level sources.
12:05Let's take a quick look at how to clean this up a little bit so that we can fit that data source in.
12:10Let's go ahead and highlight our inputs, drag them down, and drop them.
12:16Now we've got a nice space to add our school level data source.
12:20Let's click the plus sign, choose Microsoft Excel, and we're going to bring in our Teacher Data School 2019-2020.
12:30Let's click Open, and our teacher data source will come in right below our other school level data sources.
12:39If we bring our view up, we can see that we need to use Data Interpreter again to clean up our file.
12:45Alright, that's looking good, and we can quickly see that we have all seven sources that we're going to need for our flow.
12:53Now again, there's going to be a lot more cleanup and adding of data, but for now, this is a great start.
12:59Be sure to save your flow by hitting the Save Flow icon before moving on to the next step.
13:05I'm going to go ahead and name my flow Connecting to Data Complete.
13:11As I'll save this off as a separate version that you can look at later if you need to skip ahead and see how something's been done inside the course.
13:20I'll click Save, and now my flow is saved.
13:24Next up, we'll be looking at wildcard unions for our flat file ingestion.
13:29Thanks, guys .