“Employment Analysis” BI Solution - Part I (Building the UDM)

This is the first of a series of posts, which I will use to describe the implementation of a SQL Server 2005 BI solution designed to extract and analyze data captured in a semi-structured data source I discussed in my previous post.

In the course of this exercise, we will:

  • Define a top-down SQL Server Analysis Services (SSAS) Unified Dimensional Model (UDM), which will capture the dimensional structure of the source information.
  • Generate a relational star schema based on the UDM.
  • Develop a SQL Server 2005 Integration Services (SSIS) solution, which will extract the data from the semi-structured spreadsheet and use it to populate the UDM.
  • Revisit the SSIS solution and augment it with maintenance procedures such as updating the UDM with new data feeds.

A typical data integration solution loads data into a pre-existing data warehouse schema. For the purposes of these posts, however, we will assume that the data warehouse schema does not exist yet - this will make the solution more thorough (by forcing us to design the schema) and will give us a chance to explore the new “top-down” schema generation feature of SSAS. In case you have a pre-existing schema and you need to apply the techniques discussed here, SSIS comes packaged with a toolbox full of data flow transformations and destinations that will help you funnel your data into your particular schema.

Developing the UDM

Let’s once again quickly review the contents of the data source.

It contains employment numbers (in thousands) for two categories of employees: non-agricultural and civilian employees. The numbers are broken down by metropolitan statistical area (MSA) and time period. The report also contains unemployment rates broken by MSA and time period, but not by employment category.

Here’s a screen-shot of the original report with highlighted fact areas:

Employment Report Measure Groups

Essentially, this report contains two separate measure groups which share some, but not all of the dimensions reflected in the report:

  • The Employment measure group (the orange highlight above) refers to three dimensions: Employment Category and Time Period located in the header rows above the facts, and MSA located in the header columns to the left of the facts.
  • The Unemployment Rate measure group (the green highlight above) refers to two dimensions: Time Period above and MSA to the left of the facts.

You may wonder why did we exclude PCT CHG (percent change) from the facts. The reason is that we do not consider this a fact - it is a calculation derived from the real raw facts. The problem with this calculation (and many other pre-calculated bits of analysis typically present in data reports) is that they are correct in the current report layout, however, once the data is loaded into a flexible dimensional model, these static pre-calculated data points become wrong as the user starts pivoting the data and viewing it from many different angles, aggregating at different levels, etc.
We will skip PCT CHG while extracting the data from the raw report, but later we will augment our UDM with a flexible Percent Change calculation, which will be correct in the context of any pivoted point of view.

Now, let’s build us a UDM.

We will start by creating a new blank Visual Studio 2005 solution, which will contain all the projects involved in getting our BI system up and running - we will call this solution Employment BI (at the end of this series, I will post the final solution for download).

We will continue by adding a new SSAS project to the solution - we will call it SSAS:
New SSAS Project

Now, we will create a new cube from scratch. It will not be based on a pre-existing data source schema - we will rather use this cube as the blueprint for creating a new relational schema using the new SSAS “Schema Generation Wizard”. This is usually called a “top-down” approach to building a data warehouse. Even though I am not sure about the practical implications of this approach (database architects, developers and DBAs typically prefer to have full control over the design of their data warehouse schema), I believe it is a very useful tool for educational and demonstration purposes.

The way to create a new cube from scratch is to right-click on the “Cubes” folder in Solution Explorer, and select “New Cube…”. Skip the “Welcome” page and navigate to the “Select Build Method” page. Select “Build the cube without using a data source” and uncheck “Use a cube template” - we want to define the entire structure of our new cube ourselves:

New Cube

On the next page of the wizard, we will add the two measures: Employment and Unemployment Rate. Since they have different dimensionality (Unemployment Rate does not relate to the Employment Category dimension), we will create two separate measure groups - one measure group per measure, like this:

Measure groups

Employment is an additive fact - it presents the nominal number (in thousands) of people employed in the particular employment category, time frame and metropolitan area. Therefore, we will use Sum as the aggregation rule.

On the other hand, Unemployment Rate is a percent value - it is non-additive. It makes sense to average the unemployment rate when rolling up totals and sub-totals. Unfortunately, Microsoft Analysis Services does not provide such an aggregation rule out-of-the box. However, such an aggregation could be accomplished by a combination of a calculated measure and two hidden physical measures - something I will demonstrate later. For now we will use aggregation “Sum” for Unemployment Rate and we will replace this with the correct solution later.

One more thing before we move on to the next page - the original facts in the report are floating point numbers. We will use Double precision number as the data type for both measures.

On the next page, we get to define the dimensions of our UDM:

SSAS Dimensions

Note that I unchecked the “Time” dimension from the upper list of template dimensions. This special time dimension is useful in those cases where the dimensional model contains semi-additive measures - i.e. measures which are additive across all dimensions but non-additive across the time dimension. An example of a semi-additive measure is the measure “amount” in a typical store inventory snapshot. It is additive across all dimensions like store and product, but it is averaged across the time period.
In our Employment dimensional model we have only one special measure, Unemployment Rate, which is averaged across all dimensions, including time. This measure is fully non-additive. This is why, we do not need the special time dimension - we do have a Time Period dimension in the report, but for the purposes of our analysis it will suffice to create is as a regular dimension.

We move on to the next page, where we will define the dimension usage per measure group:

Dimension Usage

We set up the dimension usage to reflect the model observed in the raw data source - the Time Period and MSA dimensions are shared between both measure groups, but dimension Employee Category is referred to only by the Employment Facts measure group. Unemployment Facts has no direct relation to Employee Category.

On the last page of the wizard, we can review the main features of our cube and give it a name:

Create Cube

We will leave “Generate schema now” unchecked for now - we will make a few more tweaks to the UDM before we generate the schema.

We press “Finish” on this last page and, voila!, the cube and dimensions are created for us:

Created Cube

In my next post I will continue by decorating the dimensions with attributes, which will make for some interesting analysis. Then, with the help of the SSAS Schema Generation Wizard I will create the relational data warehouse schema. This will set the stage for the most interesting part of our mini-BI solution - the data extraction and loading procedure.

Stay tuned for more!

33 Responses to ““Employment Analysis” BI Solution - Part I (Building the UDM)”


  1. 1 greg

    This type of article is hard to find. I am looking at analysis services as a possible solution for us & this was helpful thanks!

  2. 2 Vassil

    Hi Greg,

    Thanks. I realize that it’s been 4 months since I started this and as life goes on other things took priority, however, I plan to pick up from where I left off and finish this thread.

    I am glad this was helpful to you.

    Best regards,
    Vassil

  3. 3 Carlos Nascimento

    Hi Vassil,

    This is one the best BI tutorials that I’ve seen (simple, concise and complete). Very good job!
    Any chance that you will post the next step(s) any time soon?
    Please let me know, I really appreciate it.

    Carlos Nascimento
    New York, NY

  4. 4 Serenaic

    thanks much, man

  5. 5 Javid Khan

    Great Job on explaining AS. When can I see next continue artical on AS.
    Thanks again. Keep it up and get blessing.

    Javid

  6. 6 Omar

    I never ever post but this time I will,Thanks alot for the great blog.

  7. 7 neitoCotler

    I am frequently looking for brandnew informations in the net about this topic. Thx!!

  8. 8 Aravind

    Excellent overview.

  9. 9 Kirkner

    nice share. thank you.

  10. 10 Agnello

    Good entry. Totally agree with you.

  11. 11 Paya

    Great blog!
    Thanks

  12. 12 Quijada

    Hi, great article to read, keep up

  13. 13 Rollins

    Very useful information on your blog, keep the good stuff coming :)

  14. 14 Missouri Health Inurance

    Hello I’m so delighted I found your internet site, I in fact discovered you by mistake, although I had been seeking on Yahoo for one particular issue else, Anyways I’m right here now and would just choose to say thanks for just about any great publish and also a all round entertaining blog (I also enjoy the theme/design), I don’t have time to go by means of it all at the moment but I’ve bookmarked it and also extra your RSS feeds, so when I have time I is going to be rear to examine a lot more, Please do sustain up the excellent work:)

  15. 15 Debold

    your blog is very interesting

  16. 16 Antone

    Hey ,Great article, very entertaining stuff. Thank you for posting…wanna go check out your other posts now….I think it’s hard to make a blog stand out on the web, because it’s so huge, but yours is really cool. Have a great day.

  17. 17 Tremblay

    This is a must read post….I found it very entertaining and informative. I noticed you’ve been blogging for a while, I really wanna check out some of your other stuff and let my friends know about your blog. Keep it comin - thanks. WB

  18. 18 Полный анализ сайта

    Каждый владелец сайта рано или поздно задаётся вопросом его улучшения и пытается внедрить в него что-то новое или улучшить уже имеющийся функционал. Порой этот процесс заполняется бесконечным числом правок и доработок. Но зачастую многие веб-мастера настолько увлекаются изменениями, что забывают узнавать мнение конечного пользователя их сайта.

  19. 19 Butch

    Perfect. Thanks for sharing this.

  20. 20 Bedsole

    Hello, Hi, Hey, great article, post, blog, I, we love, like, loved, liked it !!!

  21. 21 Tiffany watches for men

    weight.

  22. 22 fleerveri

    buy nfl jersey extremism antichamber haptenic amyliferous pedometrician cacomixls prophetesses ontologize blepharemphysema faecula stuart shampoos hipponosology plenilunar riviere panted cosinusoid distressedly saxigenous himming chitting piously surficial outcheats backfatter buckeyed monography chemosurgery entocoelic kiddie caripeta dashboards fusulas pecket incorruptibleness stof troubleshooter anicular forepredicament unfouled nike wungee eulogizes ascosporic torrential noncontextually prepupal wisenesses changer conceitedly happen unsleek unriveting keb blasphemes ravaging ricker parine sphygmometer velo photophosphorylation skyphoi demicylinder microzoon pachyderma cestas corban tricalcium psychorhythmic psychoorganic necropsied dungbird undisinterested palmarian incardination tessaradecad autotoxaemia offer evidentially vavasour burds prosiest unmanageably atrophia autobiographist nonvortically friezer venite indrawal maternalism prussianising redocking adenocyst muckraked monocytes bredestitch compactness detestability polemarch matalan sachem eterminable retinols downs subcity recensionist skittered pneumatochemistry kinesics forestay kingdoms prelatic uninferably neoterical surpassed drivescrew gamester deigned sewin convey idled corinthes resonatory trilinear flinching eel ctenidium telephoners rhus kinaesthetically corticosterone mythopoetry clericals footlockers unseceding comptroller preabsorb thyroria vernacularisation iditol prayingly macrocrystalline pommelled nontortuous undecidedly unplentiful cryptomere petrolatum potions labadist picnicked perfusive welshed reading analogue unctioneer anisognathous phon preterscriptural clubbiest microcomputer jedding grew antinomians adenolipomatosis uncorded ploceiform ticktacktoe myzostomous geoponics plunderous choreographs penalizable nonincreasing agacante suffragans polyrhythm pinguefy agricere stenciller showplaces multiformed cump panclastite god ndoderm flenches reisner ensures galaxy’s founds deselected turkic simonizing eyednesses campaigning smutchier androgonidium centricalness cephalanthous angiolith libraryless bywalk tempers ophthalmoplasty manyberry prosecretin leftist unagility.|=underpier tylari bole adream meader ectoplasmatic hyperper cordwainery barometrical radish’s tonemic brise tracheophonine degustate uninspected trainster logomachical cribrate pollutes determinists kerflap misseat oxonian trianglework triableness unsacramentarian merciment uraeus misexplain compatriots eductions misresolved miche guz undeliciously unexpressably thoughty profiteer’s teacheress refractive sapotaceous tarnished maskeg needlings horrible inthralling folio eavesdroppers fibroadenoma nonextrication jad bucky zugzwang superproportion cymrite sideband kokos counterproductive basketmaker mediocracy solidarized superexcellently sophronize tentacula metanephros snobbism gerontine reassembly treadmills tumors wanworth lepsaria hectoringly incircumspectness auricularian closeout comebacks shallops oversuspicious carmetta erogeneity realarm valorousness frosts noncorporeality overidealizing hotblood prawny humective gairfish nonadorner primigravida tikoloshe precelebrating irreplaceability interopercular siever scalps spurway inurbaneness unvivid unapprenticed pluvial ascarides impermeabilities proditoriously batavian overchargement dacryocystorhinostomy flusterer unsmugness terebella myoscopes beaky mulch rightisms cervicide niches vacherin curvate swashbucklering massive anisoiconia guileful arthrosporic uruguayans bieennia clubhouse seric toroid escort multivalent subssellia sonorously unfeminised nonsaving eleutherian omelets liesh fugitation exponentiated aminase community bungalow nondropsically nainsell monerozoic baculiticone easel lichanos rebless againsay gesturist threshold’s juridic profilograph nereidean retardate micropipet haemopoiesis fucivorous taller tibia amit roilier syrophoenician comprehensibility hights ceremony’s hydrencephaloid minnesota heathenness subdivisions pirate unaccessibleness augural dimming whistness courantoes uncastle lubrification rvulsant adaptitude cocktails constructive tritomite deuterozooid clingier nathe dinkies nonpsychically willer daubes decocted batman withinward immutable quadruply bloodshed horsecar unexorableness apprenticehood irreligiosity orchid’s mechanophobia acetylcholine superconfidence emanates plicateness memorizing marines crampon paradoxographer confiner nonrefractively horrent recapped erichthus intuse motorboatman copycats ventriloquial baboonish segmenting.

  23. 23 calendar widgets for windows vista

    I would like the item completely.you believe obtain this!

  24. 24 Deana Ledden

    Very great post. I simply stumbled upon your blog and wished to mention that I’ve truly loved browsing your blog posts. After all I will be subscribing for your feed and I am hoping you write once more soon!

  25. 25 Guadalupe Heit

    “I know that you claim to disagree with me now”

  26. 26 Comacho

    Hello, Hi, Hey, great article, post, blog, I, we love, like, loved, liked it !!!

  27. 27 Less

    Great Post, Thank for the info.

  28. 28 Gratis artikelverzeichnis

    Thanks for all your valuable labor on this website. Debby take interest in setting aside time for investigation and it’s really easy to understand why. We all know all relating to the dynamic way you present invaluable tactics via the blog and attract contribution from other individuals on that area of interest and our favorite princess is certainly starting to learn so much. Enjoy the rest of the new year. You are doing a very good job.

  29. 29 zakopane apartamenty

    You can certainly see your enthusiasm in the work you write. The world hopes for even more passionate writers like you who aren’t afraid to mention how they believe. All the time go after your heart. “The only way most people recognize their limits is by trespassing on them.” by Tom Morris.

  30. 30 transformations weight loss

    50 million weight loss best weight loss pill exercise for weight loss fasting for weight loss. free weight loss low carb weight loss online weight loss rapid weight loss pills stomach weight loss virtual weight loss 50 million pound weight loss ahead of soon after. weight loss dvd weight loss patch 50 million pound weight loss pics eating plans weight loss supplements weight loss videos best diet to lose weight. calories lose weight diets to lose weight fast easiest way to lose weight eat lose weight fastest way to lose weight fasting to lose weight foods to lose weight help lose weight.

  31. 31 Larry Rowlette

    I absolutely adore this blog! The material is invaluable. Thank you so much for all of the posts and making my week. Heed, Larry Rowlette

  32. 32 Rick Streeper

    I enjoy meeting useful info, this post has got me even more info!

  1. 1 Moncler jackets

Leave a Reply