---
_id: '1116'
abstract:
- lang: eng
  text: This paper presents a system that uses the domain name of a German business
    website to locate its information pages (e.g. company profile, contact page, imprint)
    and then identifies business specific information. We therefore concentrate on
    the extraction of characteristic vocabulary like company names, addresses, contact
    details, CEOs, etc. Above all, we interpret the HTML structure of documents and
    analyze some contextual facts to transform the unstructured web pages into structured
    forms. Our approach is quite robust in variability of the DOM, upgradeable and
    keeps data up-to-date. The evaluation experiments show high efficiency of information
    access to the generated data. Hence, the developed technique is adaptive to non-German
    websites with slight language-specific modifications, and experimental results
    on real-life websites confirm the feasibility of the approach.
author:
- first_name: Yeong Su
  full_name: Lee, Yeong Su
  last_name: Lee
- first_name: Michaela
  full_name: Geierhos, Michaela
  id: '42496'
  last_name: Geierhos
  orcid: 0000-0002-8180-5606
citation:
  ama: 'Lee YS, Geierhos M. Business Specific Online Information Extraction from German
    Websites. In: Gelbukh A, ed. <i>Computational Linguistics and Intelligent Text
    Processing: 10th International Conference, CICLing 2009, Mexico City, Mexico, March
    1-7, 2009, Proceedings</i>. Vol 5449. Lecture Notes in Computer Science. Berlin,
    Germany: Springer; 2009:369-381. doi:<a href="https://doi.org/10.1007/978-3-642-00382-0_30">10.1007/978-3-642-00382-0_30</a>'
  apa: 'Lee, Y. S., &#38; Geierhos, M. (2009). Business Specific Online Information
    Extraction from German Websites. In A. Gelbukh (Ed.), <i>Computational Linguistics
    and Intelligent Text Processing: 10th International Conference, CICLing 2009, Mexico City,
    Mexico, March 1-7, 2009, Proceedings</i> (Vol. 5449, pp. 369–381). Berlin,
    Germany: Springer. <a href="https://doi.org/10.1007/978-3-642-00382-0_30">https://doi.org/10.1007/978-3-642-00382-0_30</a>'
  bibtex: '@inbook{Lee_Geierhos_2009, place={Berlin, Germany}, series={Lecture Notes
    in Computer Science}, title={Business Specific Online Information Extraction from
    German Websites}, volume={5449}, DOI={10.1007/978-3-642-00382-0_30},
    booktitle={Computational Linguistics and Intelligent Text Processing: 10th International
    Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009, Proceedings}, publisher={Springer},
    author={Lee, Yeong Su and Geierhos, Michaela}, editor={Gelbukh, Alexander},
    year={2009}, pages={369–381}, collection={Lecture Notes in Computer Science} }'
  chicago: 'Lee, Yeong Su, and Michaela Geierhos. “Business Specific Online Information
    Extraction from German Websites.” In <i>Computational Linguistics and Intelligent
    Text Processing: 10th International Conference, CICLing 2009, Mexico City, Mexico, March
    1-7, 2009, Proceedings</i>, edited by Alexander Gelbukh, 5449:369–81. Lecture
    Notes in Computer Science. Berlin, Germany: Springer, 2009. <a href="https://doi.org/10.1007/978-3-642-00382-0_30">https://doi.org/10.1007/978-3-642-00382-0_30</a>.'
  ieee: 'Y. S. Lee and M. Geierhos, “Business Specific Online Information Extraction
    from German Websites,” in <i>Computational Linguistics and Intelligent Text Processing:
    10th International Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009,
    Proceedings</i>, vol. 5449, A. Gelbukh, Ed. Berlin, Germany: Springer, 2009, pp.
    369–381.'
  mla: 'Lee, Yeong Su, and Michaela Geierhos. “Business Specific Online Information
    Extraction from German Websites.” <i>Computational Linguistics and Intelligent
    Text Processing: 10th International Conference, CICLing 2009, Mexico City, Mexico, March
    1-7, 2009, Proceedings</i>, edited by Alexander Gelbukh, vol. 5449, Springer,
    2009, pp. 369–81, doi:<a href="https://doi.org/10.1007/978-3-642-00382-0_30">10.1007/978-3-642-00382-0_30</a>.'
  short: 'Y.S. Lee, M. Geierhos, in: A. Gelbukh (Ed.), Computational Linguistics and
    Intelligent Text Processing: 10th International Conference, CICLing 2009, Mexico City,
    Mexico, March 1-7, 2009, Proceedings, Springer, Berlin, Germany, 2009, pp.
    369–381.'
conference:
  end_date: 2009-03-07
  location: Mexico City, Mexico
  name: 10th International Conference on Computational Linguistics and Intelligent
    Text Processing (CICLing 2009)
  start_date: 2009-03-01
date_created: 2018-01-29T14:06:15Z
date_updated: 2022-01-06T06:50:57Z
department:
- _id: '36'
- _id: '1'
- _id: '579'
doi: 10.1007/978-3-642-00382-0_30
editor:
- first_name: Alexander
  full_name: Gelbukh, Alexander
  last_name: Gelbukh
extern: '1'
intvolume: '      5449'
language:
- iso: eng
page: 369-381
place: Berlin, Germany
publication: 'Computational Linguistics and Intelligent Text Processing: 10th International
  Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009, Proceedings'
publication_identifier:
  eisbn:
  - 978-3-642-00382-0
  isbn:
  - 978-3-642-00381-3
publication_status: published
publisher: Springer
quality_controlled: '1'
series_title: Lecture Notes in Computer Science
status: public
title: Business Specific Online Information Extraction from German Websites
type: book_chapter
user_id: '42496'
volume: 5449
year: '2009'
...
