The Commons


Patent Title: Clustering mixed attribute patterns

Assignee: IBM
Patent Number: US6260038
Issue Date: 07-10-2001
Application Number:
File Date: 09-13-1999

Abstract: A technique for clustering data points in a data set that is arranged as a matrix having n objects and m attributes. Each categorical attribute of the data set is converted to a 1-of-p representation of the categorical attribute. A converted data set A is formed based on the data set and the 1-of-p representation for each categorical attribute. The converted data set A is compressed using, for example, a Goal Directed Projection compression technique or a Singular Value Decomposition compression technique, to obtain q basis vectors, with q being defined to be at least m+1. The converted data set is then projected onto the q basis vectors to form a data matrix having at least one vector, with each vector having q dimensions. Lastly, a clustering technique is performed on this data matrix of q-dimensional vectors.
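
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes NumPy, uses the Singular Value Decomposition option only (Goal Directed Projection is not shown), and swaps in a plain k-means with a farthest-point start as the clustering step, since the abstract does not name one. All function names and the toy data are hypothetical.

```python
import numpy as np

def one_of_p(values):
    """Encode one categorical column as p indicator (1-of-p) columns."""
    categories = sorted(set(values))
    index = {c: j for j, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for i, v in enumerate(values):
        out[i, index[v]] = 1.0
    return out

def convert(numeric, categorical):
    """Form the converted data set A: numeric columns, then 1-of-p blocks."""
    blocks = [np.asarray(numeric, dtype=float)]
    blocks += [one_of_p(col) for col in categorical]
    return np.hstack(blocks)

def svd_basis(A, q):
    """Return the top-q right singular vectors of A as columns."""
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    return vt[:q].T

def kmeans(X, k, iters=10):
    """Plain Lloyd's k-means (assumed stand-in for the clustering step)."""
    # Farthest-point start: deterministic, first center is row 0.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def cluster_mixed(numeric, categorical, q, k):
    """Convert, compress with SVD, project onto q basis vectors, cluster."""
    A = convert(numeric, categorical)
    Y = A @ svd_basis(A, q)  # n rows, each a q-dimensional vector
    return A, Y, kmeans(Y, k)

# Toy data: n = 6 objects, m = 2 attributes (one numeric, one categorical).
numeric = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
categorical = [["red", "green", "blue", "red", "green", "blue"]]

# q = m + 1 = 3, the smallest value the abstract allows.
A, Y, labels = cluster_mixed(numeric, categorical, q=3, k=2)
print(A.shape, Y.shape, labels)
```

In this toy case the converted data set has four columns (one numeric plus three indicator columns), so q = 3 drops a single direction. The sketch applies no scaling or centering before the decomposition; whether the patented technique does so is not stated in the abstract.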



IBM Pledge dated 1/11/2005