Computer Engineering / Bilgisayar Mühendisliği

Permanent URI for this collection: https://hdl.handle.net/11147/10

Search Results

  • Conference Object
    Citation - Scopus: 6
    Geodesic Distances for Web Document Clustering
    (Institute of Electrical and Electronics Engineers Inc., 2011) Tekir, Selma; Mansmann, Florian; Keim, Daniel
    While traditional distance measures are often capable of properly describing similarity between objects, in some application areas there is still potential to fine-tune these measures with additional information provided in the data sets. In this work we combine such traditional distance measures for document analysis with link information between documents to improve clustering results. In particular, we test the effectiveness of geodesic distances as similarity measures under the space assumption of spherical geometry in a 0-sphere. Our proposed distance measure is thus a combination of the cosine distance of the term-document matrix and some curvature values in the geodesic distance formula. To estimate these curvature values, we calculate clustering coefficient values for every document from the link graph of the data set and increase their distinctiveness by means of a heuristic as these clustering coefficient values are rough estimates of the curvatures. To evaluate our work, we perform clustering tests with the k-means algorithm on the English Wikipedia hyperlinked data set with both traditional cosine distance and our proposed geodesic distance. The effectiveness of our approach is measured by computing micro-precision values of the clusters based on the provided categorical information of each article. © 2011 IEEE.
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 1
    Performance Evaluation of Group Communication Architectures in Large Scale Systems Using MPI
    (Springer Verlag, 2006) Erciyeş, Kayhan; Dağdeviren, Orhan; Payli, Reşat Ümit
    Group communication is an important paradigm for fault tolerance in large scale systems. We describe several group architectures (pipelined, hierarchical, daisy, and hypercube groups), each consisting of separate clusters, investigate the theoretical performance bounds of these architectures, and evaluate their experimental performance using MPI group communication primitives. We first derive time bounds for multicast message delivery in these architectures and then provide tests that measure the time taken for the same operation. The multicast message delivery times are tested against the number of clusters within a group and the size of the multicast message. We conclude that the daisy architecture is favorable in terms of delivery times and message sizes, both theoretically and experimentally.