Placeholder Image

Subtitles section Play video

  • Ok.

  • Great!

  • Now, we understand why people use databases.

  • However, I would imagine some of you are Excel users and still imagine an Excel spreadsheet

  • when we talk about tabular data.

  • Please, don’t do that!

  • Data tables, databases, and Excel spreadsheets are different things!

  • In this lecture, we will focus on the differences between spreadsheets and databases.

  • This exercise will be relevant, not only for current Excel users.

  • Those of you who do not use Excel regularly will still have the chance to understand the

  • advantages and the disadvantages of using databases or spreadsheets.

  • Ok.

  • Let’s start with a definition What is a spreadsheet?

  • It is an electronic ledger, an electronic version of paper accounting worksheets.

  • It was created to facilitate people who needed to store their accounting information in tabular

  • form digitally.

  • So, it is possible to create tables in a spreadsheet.

  • This is one reason some people believe spreadsheets and databases are interchangeable, while,

  • in reality, they aren’t.

  • There are similarities between the two.

  • Both can contain a large amount of tabular data and can use existing data to make calculations.

  • Third, neither spreadsheets nor databases are typically used by a single person, so

  • many users will work with the data.

  • The differences between the two forms of data storage lie in the way these three characteristics

  • are implemented.

  • Ok.

  • Imagine a spreadsheet.

  • Every cell is treated as a unique entity.

  • It can store any type of information – a date, an integer value, a string name.

  • And then, not only can we have different types of values in various cells, but we can also

  • apply a specific format to these cells.

  • This is not inherent to databases.

  • They contain only raw data.

  • Each cell is a container of a single data value.

  • It is the smallest piece of information there is.

  • You must pre-set the type of data contained in a certain field.

  • This feature prevents inadvertent mistakesfor example, in a field containing date

  • values, should the user try to insert a string, the software will show an error and she will

  • have the chance to correct herself.

  • This won’t happen in Excelif you insert a string in the column with date values, you

  • wouldn’t obtain an error message, and Excel will store the string value.

  • In a spreadsheet, data can be stored in a cell, while in a database, data is stored

  • in a record of a table, meaning you must count the records in a table to express how long

  • the data table is, not the number of the cells.

  • And that is ityou cannot pick a font colour or size.

  • All you care about is the information being stored; you don’t care about formatting.

  • Our main goal is to save the numbers.

  • Another substantial difference is that, in a spreadsheet, different cells can contain

  • calculations, such as functions and formulas.

  • This means, if you want to combine two integers, the result will be stored in another cell.

  • In a database, all calculations and operations are based on the existing data and are done

  • after its retrieval.

  • There is a specific feature, calledviews”, similar to the tables, in which you can do

  • a calculation.

  • These objects also contain columns that can be normal columns like the ones in the tables

  • or could contain a certain type of calculation.

  • There is no way you can mistake a record of data with a calculation.

  • The database features mentioned so far improve data integrityyou can’t store different

  • types of data in the same field, and it is unlikely someone will mistake a data value

  • for an outcome of a calculation, especially in large data sets.

  • Data integrity is a strong advantage when working with databases.

  • Naturally, you might think a spreadsheet can contain multiple worksheets, so one can create

  • tables in the worksheets, and then use the worksheets to create relations between the

  • tables.

  • Why bother using relational databases?

  • Well, in a spreadsheet, such relations will be logically limited.

  • Instead of setting up spreadsheets or worksheets, one can set up relations between the tables,

  • and this will boost the performance of operations, increasing the speed with which you could

  • manipulate your dataset.

  • Albeit powerful for many circumstances, spreadsheets have their limitations.

  • Excel is incapable of handling over 1 million rows of data.

  • This immediately induces us to look for a solution.

  • Usually, the fix is to use databases, where having 2, 5, or 10 million records is not

  • a problem.

  • Referring to the multi-user property, spreadsheets are lagging.

  • Essentially, every person must update their own spreadsheet with new data.

  • For instance, if there is a new purchase to register or a last name in theCustomers

  • table to correct, every user must make these changes manually.

  • You would justifiably think Google Docs and the latest versions of Office solve this issue,

  • but they do so only partially.

  • In Google Docs, you might have trouble finding out who changed or deleted information incorrectly,

  • which often leads to a cumbersome situation where people have a hard time organizing their

  • tasks.

  • As opposed to that, you saw in the Data Control Language lecture that databases provide a

  • stable structure, controlling access permissions and user restrictions.

  • One person can make a change that is visible to everybody instantly.

  • This feature increases efficiency and data consistency when using databases.

  • Considering data integrity and data consistency, using databases eliminates duplicate information,

  • which is another way to save space and increase efficiency.

  • Look at theCustomerstable.

  • You know a certain first and last name corresponds to a unique email address.

  • So, if you know John McKinley has changed his email and you are using a spreadsheet

  • flooded with data, you may change the email address once and accidentally miss updating

  • the same address in another record.

  • This may lead to inadvertent mistakes.

  • They can be avoided when using a relational database - an accredited user only needs to

  • access theCustomerstable and change John McKinley’s email address there.

  • Just once.

  • Not only will this operation save time, but it will also anticipate inconsistencies.

  • So, what we discussed in this lesson highlights why databases are a better environment for

  • storing and keeping track of data when working with multiple dimensions and large amounts

  • of data.

  • Spreadsheets have their advantages as wellthey are an excellent tool that allows

  • us to carry out extensive analysis.

  • But for the easy retrieval and updating of data, efficiency, data consistency, data integrity,

  • speed, and security, relational databases are definitely the structure to opt for.

  • They can store lots of raw data and are excellent when separating the data from the way it is

  • displayed for analysis.

  • As you saw, it would be a good idea to stop trying to visualize data tables in the form

  • of spreadsheets.

  • They are different.

  • Stay tuned for the next lecture, where we will offer more database terminology.

  • Thank you for watching!

Ok.

Subtitles and vocabulary

Click the word to look it up Click the word to find further inforamtion about it