HEIKKI MANNILA
- Professor, Aalto University, Department of Information
and Computer Science
- Vice President for Academic Affairs, Aalto University
- Docent, Department of Computer Science, University of Helsinki
Address:
Lämpömiehenkuja 2, P.O. Box 7800, FI-02015 TKK, Espoo, Finland
Phone: +358 50 511 2913, fax: +358 9 470 25742
- Email:
Heikki.Mannila@aalto.fi;
- Address at
Department of Information and Computer Science:
Konemiehentie 2,
Room A334 (3rd floor).
Research interests
My primary research interests are in algorithms, data mining, and data analysis.
I look at basic research questions in computer science in areas where
there are applications in sight. This mode of operation
is the basic style of the
Center for Excellence in Algorithmic Data Analysis.
The application areas I am interested in include computational biology,
paleontology, and linguistics.
Jefrey Lijffijt, Panagiotis Papapetrou, Kai Puolamaki, Heikki Mannila:
Analyzing Word Frequencies in Large Text Corpora Using Inter-arrival Times and Bootstrapping.
ECML/PKDD (2) 2011: 341-357
Panagiotis Papapetrou, Aristides Gionis, Heikki Mannila:
A Shapley Value Approach for Influence Attribution. ECML/PKDD (2) 2011: 549-564
Aleksi Kallio, Niko Vuokko, Markus Ojala, Niina Haiminen, Heikki Mannila:
Randomization techniques for assessing the significance of gene periodicity results.
BMC Bioinformatics 12: 330 (2011)
Gemma C. Garriga, Esa Junttila, Heikki Mannila:
Banded structure in binary matrices. Knowl. Inf. Syst. 28(1): 197-226 (2011)
T. Nevalainen, H. Raumolin-Brunberg and H. Mannila:
The diffusion of language change in real time: Progressive and conservative individuals and the time depth of change
Language Variation and Change 23, 1, 1-43, 2011.
J. Saarinen, E. Oikarinen, M. Fortelius and H. Mannila: The living and the fossilized: how well do unevenly distributed points capture the faunal information in a grid.
Evolutionary Ecology Research, 12: 363–376, 2010.
Theodoros Lappas, Evimaria Terzi, Dimitrios Gunopulos, Heikki Mannila:
Finding effectors in social networks.
KDD 2010: 1059-1068
Panu Luosto, Jyrki Kivinen, Heikki Mannila:
Gaussian Clusters and Noise:
An Approach Based on the Minimum Description Length Principle.
Discovery Science 2010: 251-265
T. Elomaa, H. Mannila, P. Orponen (eds.):
Algorithms and Applications, Essays Dedicated to Esko Ukkonen on the Occasion of His 60th Birthday. ISBN 978-3-642-12475-4, Springer 2010.
T. Vesala, S. Launiainen, P. Kolari, J. Pumpanen,
S. Sevanto, P. Hari, E. Nikinmaa,
P. Kaski, H. Mannila, E. Ukkonen, S. Piao and P. Ciais:
Autumn temperature and carbon balance of a boreal Scots pine forest in Southern Finland.
Biogeosciences 7, 163-176, 2010.
M. Ojala, G. Garriga, A. Gionis, H. Mannila:
Evaluating Query Result Significance in Databases via
Randomizations.
SDM'10: Proceedings of the 2010 SIAM International Conference on Data Mining, p. 906-917.
J. Wessman, T. Paunio, A. Tuulio-Henriksson,
M. Koivisto, T. Partonen, J. Suvisaari, JA. Turunen, J. Wedenoja, W. Hennah,
O. Pietilainen, J. Lonnqvist, H. Mannila, L. Peltonen:
Mixture model clustering of phenotype features reveals evidence
for association of DTNBP1 to a specific subtype of schizophrenia.
Biological Psychiatry, Volume 66, Issue 11, Pages 990-996, 2009.
M. Miah, G. Das, V. Hristidis, H. Mannila:
Determining Attributes to Maximize Visibility of Objects
IEEE Transactions on Knowledge and Data Engineering
21, 7 (2009), 959-973.
H. Hakkoymaz, G. Chatzimilioudis, D. Gunopulos, H. Mannila:
Applying Electromagnetic Field Theory Concepts to Clustering with Constraints.
ECML/PKDD (1) 2009: 485-500.
T. Feder, H. Mannila, E. Terzi:
Approximating the Minimum Chain Completion problem.
Information Processing Letters, 109, 17, 2009, 980-985.
S. Hanhijärvi, M. Ojala, N. Vuokko, K. Puolamäki, N. Tatti, and H. Mannila:
Tell me something I don't know: Randomization strategies for iterative data mining.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09),
p. 379-388.
L.H. Liow, M. Fortelius, K. Lintulaakso, H. Mannila, N.Chr. Stenseth:
Lower Extinction Risk in Sleep-or-Hide Mammals.
American Naturalist 2009. Vol. 173, pp. 264-272.
A. Ukkonen, K. Puolamäki, A. Gionis, H. Mannila:
A Randomized Approximation Algorithm for Computing Bucket
Orders.
Information Processing Letters 109(7):356-359, 2009.
H. Mannila:
Finding Total and Partial Orders from Data for Seriation,
Discovery Science 2008
p. 16-25.
G. Garriga, A. Ukkonen, H. Mannila:
Feature Selection in Taxonomies with Applications to Paleontology,
Discovery Science 2008
p. 112-123.
[Correction.]
N. Haiminen, H. Mannila, E. Terzi:
Determining significance of pairwise co-occurrences of events in bursty sequences.
BMC Bioinformatics 9(336), 2008.
[online, open access]
P. Miettinen, T. Mielikainen, A. Gionis, G. Das, H. Mannila: The Discrete Basis Problem.
To appear in IEEE Transactions on Knowledge and Data Engineering, 20(10), October 2008.
[PrePrint
from IEEE]
(An expanded version of
P. Miettinen, T. Mielikainen, A. Gionis, G. Das, H. Mannila:
The Discrete Basis Problem.
10th European Conference on Principles and Practice of Knowledge
Discovery in Databases (PKDD) 2006, p. 335-346.
PKDD Best Paper.)
N. Haiminen, H. Mannila:
Evaluation of BIC and cross validation for model selection on sequence segmentations.
International Journal of Data Mining and Bioinformatics (IJDMB) (in press).
L.H. Liow, M. Fortelius, E. Bingham, K. Lintulaakso,
H. Mannila, L. Flynn, and N.Chr. Stenseth:
Higher origination and extinction rates in larger mammals.
Proc Natl Acad Sci 105(16), pp. 6097-6102, 2008.
G. Garriga, E. Junttila, H. Mannila:
Banded structure in binary matrices.
Proceedings of the 14th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD-2008), Las Vegas, Nevada,
United States, August 24-27th, 2008, pages 292-300.
P.E. Lundmark, U. Liljedahl, D.I. Boomsma, H. Mannila,
N.G. Martin, A. Palotie, L. Peltonen, M. Perola, T.D.
Spector and A.-C. Syvanen:
Evaluation of HapMap data in six populations of European descent.
European Journal of Human Genetics 2008, 1-9.
P. Rastas, M. Koivisto, H. Mannila, and E. Ukkonen:
Phasing genotypes using a hidden Markov model.
In: Bioinformatics Algorithms: Techniques and Applications,
I. Mandoiu and A. Zelikovsky (eds.), p. 373-391, Wiley 2008.
P. Miettinen, A. Gallo, H. Mannila:
Finding duplicate descriptors: algorithms for redescription mining.
SIAM Data Mining Conference 2008, p. 334-345.
M. Ojala, N. Vuokko, A. Kallio, N. Haiminen, H. Mannila:
Randomization of real-valued matrices for assessing
the significance of data mining results.
SIAM Data Mining Conference 2008, p. 494-505.
B. Goethals, W. Le Page, and Heikki Mannila,
Mining Association Rules of Simple Conjunctive Queries,
SIAM Data Mining Conference 2008, p. 96-107.
M. Miah, V. Hristidis,
G. Das, H. Mannila:
Standing Out in a Crowd: Selecting Attributes for Maximum Visibility.
International Conference on Data Engineering (ICDE 2008),
p. 356-365.
Robert Gwadera and Aristides Gionis and Heikki Mannila:
Optimal segmentation using tree models,
Knowledge and Information Systems 15, 3 (2008).
A. Gionis, H. Mannila, T. Mielikainen, and P. Tsaparas,
Assessing Data Mining Results via Swap Randomization,
ACM Transactions on Knowledge Discovery from Data (TKDD),
Volume 1, Issue 3 (December 2007),
Article No. 14.
We consider a simple randomization technique for producing
random datasets that have the same row and column margins
with the given dataset. Then one can test the significance of
a data mining result by computing the results of interest on
the randomized instances and comparing them against the
results on the actual data. This randomization technique
can be used to assess the results of many different types
of data mining algorithms, such as frequent sets, clustering,
and rankings. To generate random datasets with given margins,
we use variations of a Markov chain approach, which
is based on a simple swap operation. We give theoretical
results on the efficiency of different randomization methods,
and apply the swap randomization method to several well-known
datasets. Our results indicate that for some datasets
the structure discovered by the data mining algorithms is
a random artifact, while for other datasets the discovered
structure conveys meaningful information.
The code is available.
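The swap operation described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function name `swap_randomize`, the list-of-lists representation, and the choice to sample two 1-entries uniformly at each step are assumptions made for illustration.

```python
import random

def swap_randomize(matrix, num_attempts, seed=None):
    """Return a randomized copy of a 0-1 matrix with unchanged row and column sums."""
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy, leave the input intact
    ones = [(i, j) for i, row in enumerate(m) for j, v in enumerate(row) if v == 1]
    if len(ones) < 2:
        return m
    for _ in range(num_attempts):
        a, b = rng.sample(range(len(ones)), 2)
        (i, j), (k, l) = ones[a], ones[b]
        # A swap is possible only when the opposite corners are both 0:
        #   1 0        0 1
        #   0 1  -->   1 0
        if i != k and j != l and m[i][l] == 0 and m[k][j] == 0:
            m[i][j] = m[k][l] = 0
            m[i][l] = m[k][j] = 1
            ones[a], ones[b] = (i, l), (k, j)
    return m
```

To assess a result, one would compute the statistic of interest on many such randomized matrices and compare the values against the statistic on the original data.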
H. Mannila:
The role of information technology for systems biology.
In
Systems Biology: A Grand Challenge for Europe,
ESF 2007, p. 21-23.
A. Ukkonen and H. Mannila:
Finding Outlying Items in Sets of Partial Rankings.
In: Knowledge Discovery in Databases: PKDD 2007,
p. 265-276.
S. Hyvonen, A. Gionis, and H. Mannila:
Recurrent predictive models for sequence segmentation.
Advances in Intelligent Data Analysis VII
(IDA 2007), p. 195-206.
H. Mannila and E. Terzi:
Nestedness and segmented nestedness.
In Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining (KDD 2007), p. 480-489.
H. Heikinheimo, E. Hinkkanen, H. Mannila, T. Mielikäinen,
and J. Seppänen,
Finding low-entropy sets and trees from binary data.
In Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining (KDD 2007), p. 350-359.
Niina Haiminen, Heikki Mannila, Evimaria Terzi:
Comparing segmentations by applying randomization
techniques.
BMC Bioinformatics 2007, 8:171 (23 May 2007).
N. Haiminen, H. Mannila: Discovering isochores by least-squares optimal segmentation.
Gene 394 (Issues 1-2), 2007, pp. 53-60
(1 June 2007).
[online via ScienceDirect]
N. Landwehr, T. Mielikäinen, L. Eronen, H. Toivonen and H.
Mannila,
Constrained hidden Markov models for population-based haplotyping,
BMC Bioinformatics 2007, 8(Suppl 2):S9.
A. Dasgupta, G. Das, and H. Mannila:
A Random Walk Approach to Sampling Hidden Databases.
Proceedings of the 2007 ACM SIGMOD international conference on
Management of Data (SIGMOD 2007), p. 629-640.
A. Hinneburg, H. Mannila, S. Kaislaniemi, T. Nevalainen and
H. Raumolin-Brunberg:
How to Handle Small Samples: Bootstrap and Bayesian Methods in the
Analysis of Linguistic Change,
Literary and
Linguistic Computing 22, 2 (June 2007) 137-150; doi: 10.1093/llc/fqm006
H. Heikinheimo, M. Fortelius, J. Eronen and H. Mannila:
Biogeography of European land
mammals shows environmentally distinct and spatially coherent clusters.
Journal of Biogeography 34, 6, 1053-1064 (2007).
doi:10.1111/j.1365-2699.2006.01664.x
A. Gionis, H. Mannila, P. Tsaparas: Clustering
Aggregation (long version)
ACM Transactions on Knowledge Discovery from Data, 1, 1 (2007).
The code is available.
R. Gwadera, A. Gionis, and H. Mannila,
Optimal Segmentation using Tree Models.
2006 IEEE International Conference on Data Mining, p. 244-253, 2006
N. Tatti, T. Mielikainen, A. Gionis, and H. Mannila,
What is the dimension of your binary data?
2006 IEEE International Conference on Data Mining, p. 603-612, 2006.
H. Heikinheimo, H. Mannila, J. Seppänen:
Finding Trees from Unordered 0-1 Data.
10th European Conference on Principles and Practice of Knowledge
Discovery in Databases (PKDD) 2006, p. 175-186.
A. Gionis, H. Mannila, K. Puolamaki, and A. Ukkonen,
Algorithms for Discovering Bucket Orders from Data,
12th International Conference on
Knowledge Discovery and Data Mining (KDD) 2006, p. 561-566.
We consider bucket orders, i.e., total orders with ties.
They can be used to capture the essential order information
without overfitting the data: they form a useful concept
class between total orders and arbitrary partial orders.
We address the question of finding a bucket order for a set
of items, given pairwise precedence information between the
items. We also discuss methods for computing the pairwise
precedence data.
We describe simple and efficient algorithms for finding
good bucket orders. Several of the algorithms have a provable
approximation guarantee, and they scale well to large
datasets. We provide experimental results on artificial and
real data that show the usefulness of bucket orders and
demonstrate the accuracy and efficiency of the algorithms.
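To make the setting concrete, the following sketch builds pairwise precedence data from a set of rankings and scores a candidate bucket order against it. It does not reproduce the algorithms of the paper. The helper names, the dictionary representation of rankings, and the L1 cost against a 0 / 0.5 / 1 matrix are assumptions chosen for illustration.

```python
from itertools import combinations

def pairwise_precedence(rankings, items):
    """p[u][v] = fraction of rankings that place u before v (a tie counts 0.5)."""
    p = {u: {v: 0.5 for v in items} for u in items}
    for u, v in combinations(items, 2):
        score = 0.0
        for pos in rankings:  # each ranking maps item -> position
            if pos[u] < pos[v]:
                score += 1.0
            elif pos[u] == pos[v]:
                score += 0.5
        p[u][v] = score / len(rankings)
        p[v][u] = 1.0 - p[u][v]
    return p

def bucket_order_cost(buckets, p):
    """L1 distance between the bucket order (a list of sets) and the matrix p."""
    bucket_of = {x: b for b, bucket in enumerate(buckets) for x in bucket}
    cost = 0.0
    for u in bucket_of:
        for v in bucket_of:
            if u == v:
                continue
            if bucket_of[u] < bucket_of[v]:
                c = 1.0   # u precedes v
            elif bucket_of[u] == bucket_of[v]:
                c = 0.5   # u and v are tied
            else:
                c = 0.0   # v precedes u
            cost += abs(c - p[u][v])
    return cost
```

When two items are ordered inconsistently across the rankings, placing them in the same bucket can cost less than forcing either total order, which is the sense in which bucket orders avoid overfitting.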
N. Landwehr, T. Mielikainen, L. Eronen, H. Toivonen, and
H. Mannila:
Constrained Hidden Markov Models
for Population-based Haplotyping,
PMSB 2006, to appear.
K. Puolamäki, M. Fortelius, H. Mannila:
Seriation in Paleontological Data Using Markov Chain Monte Carlo
Methods.
PLoS Comput Biol 2(2): e6
This paper looks at the seriation problem in paleontology.
Given a collection of fossil sites, a set of taxa, and the presence/absence
information for all taxa, find a good ordering for the sites.
We describe a probabilistic model for
the seriation problem, and show how MCMC techniques can
be used to obtain estimates for the ordering of the sites,
taxon lifetimes, etc.
Compared to the spectral method described in another paper,
the MCMC method gives better estimates of the uncertainty in the results,
but is much slower.
The code for the methods is available.
Jean-Francois Boulicaut, Luc de Raedt, Heikki Mannila (eds.):
Constraint-based mining and inductive databases.
Springer-Verlag LNCS Volume 3848,
ISBN: 3-540-31331-1,
Springer 2005.
A collection of papers on constraints in pattern discovery
and on the related concept of inductive databases.
J. Seppanen, H. Mannila:
Boolean formulas and frequent sets.
In Jean-Francois Boulicaut, Luc de Raedt, Heikki Mannila (eds.):
Constraint-based mining and inductive databases,
Springer-Verlag LNCS Volume 3848,
ISBN: 3-540-31331-1, Springer 2005, p. 348-361.
We consider the problem of approximating the frequency of a query,
given a collection of frequent itemsets. We study the
algorithm that truncates the inclusion-exclusion sum to include only the
frequencies of known itemsets, give a bound for its performance on disjunctions
of attributes that is smaller than the previously known bound,
and show that this bound is in fact achievable. We also show how to
generalize the algorithm to approximate arbitrary Boolean queries.
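The truncated inclusion-exclusion sum for a disjunction can be sketched as follows. The function name `disjunction_frequency` and the dictionary keyed by `frozenset` are hypothetical choices; the sketch only shows the truncation idea and says nothing about the bound proved in the paper.

```python
from itertools import combinations

def disjunction_frequency(attrs, known_freq):
    """Estimate fr(a1 or a2 or ...) using only itemsets whose frequency is known."""
    total = 0.0
    for size in range(1, len(attrs) + 1):
        # Inclusion-exclusion: odd-sized subsets are added, even-sized subtracted.
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(sorted(attrs), size):
            key = frozenset(subset)
            if key in known_freq:  # terms for unknown itemsets are dropped
                total += sign * known_freq[key]
    return total
```

If every subset has a known frequency, the sum is the exact frequency of the disjunction; each missing itemset contributes an error equal to its unknown frequency.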
E. Bingham, A. Gionis, N. Haiminen, H. Hiisila, H. Mannila, E. Terzi: Segmentation and Dimensionality Reduction,
SIAM Data Mining Conference (SDM) 2006.
Sequence segmentation and dimensionality reduction have
been used as methods for studying high-dimensional sequences:
they both reduce the complexity of the representation
of the original data. In this paper we study the
interplay of these two techniques. We formulate the problem
of segmenting a sequence while modeling it with a basis
of small size, thus essentially reducing the dimension of the
input sequence. We give three different algorithms for this
problem: all combine existing methods for sequence segmentation
and dimensionality reduction. For two of the proposed
algorithms we prove guarantees for the quality of the solutions
obtained. We describe experimental results on synthetic
and real datasets, including data on exchange rates
and genomic sequences. Our experiments show that the algorithms
indeed discover underlying structure in the data,
including both segmental structure and interdependencies
between the dimensions.
The code for the methods is available.
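One plausible way to combine the two techniques is to reduce the dimension first and then segment the reduced sequence. The sketch below does this with an SVD-based projection and a standard dynamic program for piecewise-constant segmentation. It is not necessarily one of the three algorithms of the paper, and the function names `reduce_dimension` and `segment` are hypothetical.

```python
import numpy as np

def reduce_dimension(x, m):
    """Project the centered rows of x onto the top m principal directions."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:m].T

def segment(x, k):
    """Interior boundaries of the k-segmentation with minimal squared error."""
    n = len(x)
    # Prefix sums let us evaluate the error of any segment without rescanning it.
    s1 = np.vstack([np.zeros(x.shape[1]), np.cumsum(x, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((x ** 2).sum(axis=1))])

    def sse(i, j):  # squared error of rows i..j-1 around their mean
        diff = s1[j] - s1[i]
        return s2[j] - s2[i] - diff @ diff / (j - i)

    cost = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, n + 1):
            for i in range(seg - 1, j):
                c = cost[seg - 1, i] + sse(i, j)
                if c < cost[seg, j]:
                    cost[seg, j], back[seg, j] = c, i
    # Trace the segment start points back from the end of the sequence.
    bounds, j = [], n
    for seg in range(k, 0, -1):
        j = back[seg, j]
        bounds.append(int(j))
    return sorted(bounds)[1:]  # drop the leading 0, keep interior boundaries
```

The dynamic program takes time quadratic in the sequence length for each segment count, so this form suits short sequences only.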
Polish translation of
D. Hand, H. Mannila and P. Smyth:
Principles of Data Mining
available: "Eksploracja danych",
Wydawnictwa Naukowo-Techniczne,
ISBN 83-204-3053-4, 2005.
F. Afrati, G. Das, A. Gionis, H. Mannila,
T. Mielikäinen, P. Tsaparas:
Mining chains of relations.
ICDM 2005, the Fifth IEEE International Conference on Data Mining, p. 553-556.
S. Papadimitriou, A. Gionis, P. Tsaparas,
R.A. Vaisanen, H. Mannila, C. Faloutsos:
Parameter-Free Spatial Data Mining Using MDL.
ICDM 2005, the Fifth IEEE International Conference on Data Mining, p. 346-353.
M. Fortelius, A. Gionis, J. Jernvall, H. Mannila, Spectral Ordering
and Biochronology of European Fossil Mammals,
Paleobiology 32, 2, 206-214.
This paper looks at the seriation problem in paleontology.
Given a collection of fossil sites, a set of taxa, and the presence/absence
information for all taxa, find a good ordering for the sites.
The biological background knowledge that is used is that
the species become extant, live for a certain period, and then become
extinct; i.e., in error-free data the correct ordering is characterized
as the ordering giving the consecutive ones property for the matrix.
Real data, however, has lots of noise, and finding the optimal ordering
is a hard problem.
We show that spectral methods give very good results.
Basically, one constructs a similarity matrix for the sites, computes
the Laplacian, and uses one of the eigenvectors as the ordering criterion.
The code is available.
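The three steps named in the summary (similarity matrix, Laplacian, eigenvector) can be sketched as below. The summary does not say which similarity measure or which eigenvector is used, so two assumptions are made here: similarity is the number of shared taxa, and the ordering criterion is the eigenvector of the second-smallest eigenvalue (the Fiedler vector). The function name `spectral_order` is hypothetical.

```python
import numpy as np

def spectral_order(occurrence):
    """Order the rows (sites) of a 0-1 site-by-taxon matrix spectrally."""
    x = np.asarray(occurrence, dtype=float)
    similarity = x @ x.T                  # assumed similarity: shared taxa count
    np.fill_diagonal(similarity, 0.0)
    laplacian = np.diag(similarity.sum(axis=1)) - similarity
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]          # assumed choice of eigenvector
    return [int(i) for i in np.argsort(fiedler)]
```

The sign of an eigenvector is arbitrary, so the ordering is determined only up to reversal; deciding which end is the oldest requires outside information.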
P. Rastas, M. Koivisto, H. Mannila, and E. Ukkonen:
A hidden Markov technique for haplotype reconstruction.
In: R. Casadio and G. Myers (eds.),
Algorithms in Bioinformatics: 5th International Workshop, WABI 2005,
Lecture Notes in Computer Science, 3692, pp. 140-151,
Springer, 2005.
S. Hyvönen, H. Junninen, L. Laakso, M. Dal Maso, T. Grönholm, B. Bonn,
P. Keronen, P. Aalto, V. Hiltunen, T. Pohja, S. Launiainen, P. Hari, H.
Mannila, M. Kulmala:
A look at aerosol formation using data mining
techniques,
Atmos. Chem. Phys., 5, 3345-3356, 2005.
A. Ukkonen, M. Fortelius, H. Mannila:
Finding partial orders from unordered 0-1 data.
In R. Grossman, R. Bayardo, K. P. Bennett (Eds.): Proceedings
of the Eleventh ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, p. 285-293.
A. Gionis, H. Mannila, P. Tsaparas,
Clustering aggregation,
In 21st International Conference on Data Engineering (ICDE) 2005.
p. 341-352.
The code is available.
M. Salmenkivi, H. Mannila: Piecewise Constant Modeling of
Sequential Data Using Reversible Jump Markov Chain Monte Carlo.
In J. Wang, M. Zaki, H. Toivonen, D. Shasha (Eds.):
Data Mining in Bioinformatics. Springer 2005, p. 85-103
M. Salmenkivi, H. Mannila: Using Markov chain
Monte Carlo and dynamic programming for event sequence data.
Knowl. Inf. Syst. 7(3): 267-288 (2005)
A. Patrikainen, H. Mannila:
Subspace clustering of high-dimensional binary data -
A probabilistic approach.
Workshop on Clustering High-Dimensional Data and Its Applications,
SIAM International Conference on Data Mining 2004, pp. 57-65.
Mikko Koivisto, Teemu Kivioja, Pasi Rastas, Heikki Mannila, and Esko
Ukkonen:
Hidden Markov modelling techniques for haplotype
analysis.
In: S. Ben-David, J. Case, and A. Maruoka (eds.),
Algorithmic Learning Theory: 15th International Conference, ALT 2004,
Lecture Notes in Computer
Science, 3244, pp. 37-52, Springer, 2004.
F. Geerts, H. Mannila, E. Terzi:
Relational link-based ranking.
The 30th International Conference on Very Large Data Bases (VLDB'04),
2004, p. 552-563.
J. Seppänen, H. Mannila,
Dense itemsets.
In W. Kim, R. Kohavi, J. Gehrke, W. DuMouchel (Eds.):
Proceedings of the Tenth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD 2004),
p. 683-688.
A. Gionis, H. Mannila, E. Terzi,
Clustered segmentations,
3rd Workshop on Mining Temporal and Sequential Data (TDM) 2004
A. Gionis, H. Mannila, J. Seppänen,
Geometric and combinatorial tiles in 0-1 data,
8th European Conference on Principles and Practice of Knowledge
Discovery in Databases (PKDD) 2004,
p. 173-184.
F. Afrati, A. Gionis, H. Mannila,
Approximating a collection of frequent sets,
10th International Conference on
Knowledge Discovery and Data Mining (KDD 2004),
p. 12-19.
Somewhat older papers
Dmitry Pavlov, H. Mannila, P. Smyth:
Beyond independence: probabilistic methods for
query approximation on binary transaction data.
IEEE Trans. Knowl. Data Eng. 15(6): 1409-1421 (2003)
Dimitrios Gunopulos, Roni Khardon, Heikki Mannila,
Sanjeev Saluja, Hannu Toivonen, and Ram Sewak Sharma.
Discovering all most specific sentences.
ACM Transactions on Database
Systems
28 (2): 140 - 174, June 2003.
(DOI:
http://doi.acm.org/10.1145/777943.777945)
Slides of ICDM 2003 invited talk:
Global structure from sequences
A. Gionis, T. Kujala and H. Mannila:
Fragments of order.
ACM SIGKDD 2003, p. 129-136.
A. Leino, H. Mannila and R.-L. Pitkanen:
Rule discovery and probabilistic modeling for onomastic data.
PKDD 2003, p. 291-302.
T. Mielikainen and H. Mannila:
The Pattern
Ordering Problem.
PKDD 2003, p. 327-338.
J. Seppanen, E. Bingham and H. Mannila:
A simple algorithm for topic identification in 0-1 data.
PKDD 2003, p. 423-434.
A. Gionis and H. Mannila:
Finding recurrent sources in sequences.
ACM ReCOMB 2003, p. 123-130.
The code is available.
Y. Zhu, J. Hollmen, R. Raty, Y. Aalto, B. Nagy,
E. Elonen, J. Kere, H. Mannila, K. Franssila, S. Knuutila:
Investigatory and analytical approaches to
differential gene expression profiling in mantle cell lymphoma.
Br J Haematol.
2002 Dec;119(4):905-15.
T. Niini, K. Vettenranta, J. Hollmen, M.L. Larramendy,
Y. Aalto, H. Wikman, B. Nagy, J.K. Seppanen, A.F. Salvador,
H. Mannila, U.M. Saarinen-Pihkala, S. Knuutila:
Expression of myeloid-specific genes in
childhood acute lymphoblastic leukemia -- a cDNA array study.
Leukemia 16, 2213-2221, 2002.
Luc de Raedt, Manfred Jaeger, Sau Dan Lee, Heikki Mannila:
A theory of
inductive query answering.
Proceedings of the 2nd IEEE International Conference on Data
Mining
Vipin Kumar, Shusaku Tsumoto, Ning Zhong, Philip S. Yu, Xindong Wu
(Eds.), pp. 123-130, 2002.
J. Han, R.B. Altman, V. Kumar, H. Mannila, D. Pregibon:
Emerging Scientific Applications in Data Mining
Communications of the ACM 45, 8 (August 2002), 54-58.
M. Salmenkivi, J. Kere, H. Mannila:
Genome Segmentation using Piecewise Constant Intensity Models and
Reversible Jump MCMC.
(European Computational Biology Conference 2002.)
Bioinformatics
18, Supplement 2, S211-S218.
P. Onkamo, V. Ollikainen, P. Sevon, HTT. Toivonen, H.
Mannila, and J. Kere: Association analysis for quantitative traits by
data mining: QHPM.
The Annals of Human
Genetics 66 (2002), 419-429.
Machine Learning: ECML 2002 -
12th European Conference on Machine Learning, LNCS 2430,
T. Elomaa, H. Mannila, H. Toivonen (Eds.).
Springer 2002.
Principles of Data Mining and Knowledge Discovery -
6th European Conference, PKDD 2002, LNCS 2431,
T. Elomaa, H. Mannila, H. Toivonen (Eds.).
Springer 2002.
E. Bingham, H. Mannila and J. Seppänen:
Topics in 0-1 data.
To appear in KDD 2002.
H. Mannila:
Global and local methods in data mining: basic techniques and open
problems.
ICALP 2002, 29th International Colloquium on Automata, Languages,
and Programming, Malaga, Spain, July 2002; (c)
Springer-Verlag
C.K. Leung, R. Ng, and H. Mannila:
OSSM: A Segmentation Approach to Optimize Frequency Counting.
ICDE 2002.
H. Mannila, A. Patrikainen, J. Seppänen, and J. Kere:
Long-range control of expression in yeast.
Bioinformatics
18, 3 (2002), 482-483.
B. Bollobas, G. Das, D. Gunopulos
and H. Mannila:
Time-Series Similarity Problems and Well-Separated Geometric Sets.
Nordic Journal on Computing, 2001. Shorter version in
13th Annual ACM Symposium on Computational
Geometry, 1997,
p. 454-456.
Principles of Data Mining, David Hand, Heikki Mannila, and
Padhraic
Smyth, MIT Press, August 2001.
New links to older papers
Here are links to some papers that previously were unlinked
in the full list of publications.
E. Bingham and H. Mannila:
Random projection in dimensionality reduction: applications to image and
text
data.
Proceedings of the Seventh ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (KDD 2001), F.
Provost and R. Srikant (eds.),
p. 245-250.
H. Mannila and C. Meek:
Global partial orders
from sequential data.
Sixth Annual Conference on
Knowledge Discovery and Data Mining (KDD-2000), p. 161-168.
G. Das and H. Mannila:
Context-based similarity methods for categorical attributes.
Principles of Data Mining and Knowledge
Discovery, 4th European Conference
(PKDD 2000)
D.A. Zighed et al. (eds.), p. 201-211.
H. Mannila and D. Rusakov:
Decomposing event sequences into independent components.
First SIAM Conference on Data
Mining, 2001.
H. Mannila and J. Seppänen:
Recognizing similar situations from event sequences.
First SIAM Conference on Data
Mining, 2001.