Microsoft held its first-ever Business Intelligence Conference in Seattle, WA this past week. For more details on the conference, visit www.microsoftbiconference.com.
I will post my view of the conference and some of the great experiences I had while I was there!
My journey in learning new ways to leverage Collaboration and Business Intelligence to improve business process.
Sunday, May 13, 2007
SQL Server 2005 Data Mining
I just completed a great project for a major motorcycle manufacturer, in which I used SQL Server 2005 Data Mining to develop a cross-selling application.
•This customer produces heavyweight street, custom and touring motor sport vehicles and offers a complete line of parts, accessories, apparel and general merchandise.
•Increase market penetration by providing features unique to the solution
•When a customer requests parts or accessories, they have to refer to a printed catalog and find the respective vehicle models and part numbers. Once the parts have been identified, the salesperson then has to look up each part to check its availability.
•Definition
–The act of excavating data to extract patterns from it
–Multiple disciplines: database, statistics, artificial intelligence
•Example Data Mining Problems
–Is this person likely to default on a loan?
–Is this email spam?
–What is the income of a customer?
–Are these lab results normal?
–What other products are purchased with a particular product?
•Algorithms
–Decision trees
–Naïve Bayes
–Neural Network
–Association Rules
–Sequence Clustering
–Time Series
–Text Mining
•Application Interface
–DMX (Data Mining Extensions) – standard DM query language
–OLE DB for Data Mining
–.NET 2.0
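To make the "What other products are purchased with a particular product?" question a bit more concrete, here is a minimal Python sketch of the idea behind association rules. It is not the Microsoft Association Rules algorithm and it is not the code from this project; it only counts how often pairs of parts show up on the same invoice and turns those counts into a confidence score. The part names, the `pair_rules` and `recommend` functions, and the `min_support` threshold are all made up for illustration.

```python
from collections import Counter
from itertools import permutations


def pair_rules(baskets, min_support=2):
    """Return {(a, b): confidence} for rules of the form "bought a -> also bought b".

    confidence(a -> b) = (invoices containing both a and b) / (invoices containing a)
    Pairs seen on fewer than min_support invoices are ignored.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore repeated parts on one invoice
        item_counts.update(items)
        # ordered pairs, so both a -> b and b -> a are counted
        pair_counts.update(permutations(sorted(items), 2))

    rules = {}
    for (a, b), together in pair_counts.items():
        if together >= min_support:
            rules[(a, b)] = together / item_counts[a]
    return rules


def recommend(rules, part, top_n=3):
    """Suggest up to top_n parts most often bought together with `part`."""
    candidates = [(b, conf) for (a, b), conf in rules.items() if a == part]
    # highest confidence first, part name as a tie-breaker
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [b for b, _ in candidates[:top_n]]


# Hypothetical invoices: each inner list is the set of parts on one invoice.
invoices = [
    ["seat", "backrest"],
    ["seat", "backrest", "windshield"],
    ["seat", "windshield"],
    ["exhaust"],
]
rules = pair_rules(invoices)
suggestions = recommend(rules, "seat")
```

A real model would also have to deal with larger item sets and with rules that are frequent but not interesting, which is exactly the kind of work the built-in algorithm takes off your hands.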
•Collect input data sets
–Determine what relevant data is available
–Design SQL Structures where needed and collect data into a single data set
–OEM questions required a product hierarchy to be established
–Part catalog had to be determined based on invoices, which contained inconsistent data (the VIN field is optional)
–OEM data had inconsistencies around models (e.g. special edition bikes) and how parts were sold (e.g. kits vs individual parts)
•Verify data integrity
–Remove or cleanse records containing bad data
•Ex: invalid VINs or non-OEM parts
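The cleansing step can be sketched roughly as follows. This is a hedged illustration in Python, not the SQL that was actually used: the record layout (`vin` and `part` keys), the `cleanse` function and the sample part numbers are assumptions. "Invalid VIN" is interpreted here only as a structural check (17 characters, with the letters I, O and Q not allowed), and a check-digit test is deliberately left out.

```python
import re

# Structural VIN check only: 17 characters, digits and capital letters,
# excluding I, O and Q. No check-digit validation is attempted here.
VIN_PATTERN = re.compile(r"[A-HJ-NPR-Z0-9]{17}")


def cleanse(invoice_lines, oem_parts):
    """Keep only invoice lines with a well-formed VIN and an OEM part number."""
    kept = []
    for line in invoice_lines:
        # the VIN field is optional, so it may be missing or None
        vin = (line.get("vin") or "").strip().upper()
        if not VIN_PATTERN.fullmatch(vin):
            continue  # missing or malformed VIN
        if line.get("part") not in oem_parts:
            continue  # non-OEM part
        kept.append({**line, "vin": vin})  # store the normalized VIN
    return kept


# Hypothetical data for illustration only.
oem_parts = {"P-100", "P-200"}
invoice_lines = [
    {"vin": "1m8gdm9axkp042788", "part": "P-100"},  # kept, VIN is upper-cased
    {"vin": None, "part": "P-100"},                 # dropped: no VIN
    {"vin": "1M8GDM9AXKP04278O", "part": "P-200"},  # dropped: letter O in VIN
    {"vin": "1M8GDM9AXKP042788", "part": "X-999"},  # dropped: non-OEM part
]
clean_lines = cleanse(invoice_lines, oem_parts)
```

Whether bad records are dropped, as here, or repaired depends on how much data you can afford to lose; with an optional VIN field, dropping every line without a VIN may remove a large share of the invoices.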