{"id":37,"date":"2024-02-01T17:50:44","date_gmt":"2024-02-01T17:50:44","guid":{"rendered":"https:\/\/citestu17.savecicadabuzz.org\/?page_id=37"},"modified":"2024-04-25T16:56:52","modified_gmt":"2024-04-25T16:56:52","slug":"week-2-call","status":"publish","type":"page","link":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/week-2-call\/","title":{"rendered":"Call"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"37\" class=\"elementor elementor-37\">\n\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1203580 e-flex e-con-boxed e-con e-parent\" data-id=\"1203580\" data-element_type=\"container\" data-settings=\"{&quot;content_width&quot;:&quot;boxed&quot;}\" data-core-v316-plus=\"true\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-ee6c97b elementor-widget elementor-widget-heading\" data-id=\"ee6c97b\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.18.0 - 20-12-2023 *\/\n.elementor-heading-title{padding:0;margin:0;line-height:1}.elementor-widget-heading .elementor-heading-title[class*=elementor-size-]>a{color:inherit;font-size:inherit;line-height:inherit}.elementor-widget-heading .elementor-heading-title.elementor-size-small{font-size:15px}.elementor-widget-heading .elementor-heading-title.elementor-size-medium{font-size:19px}.elementor-widget-heading .elementor-heading-title.elementor-size-large{font-size:29px}.elementor-widget-heading .elementor-heading-title.elementor-size-xl{font-size:39px}.elementor-widget-heading .elementor-heading-title.elementor-size-xxl{font-size:59px}<\/style><h2 class=\"elementor-heading-title elementor-size-xxl\"><span style=\"font-size: 32px; white-space-collapse: collapse;\">Is white space tokenization enough?<\/span><\/h2>\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-c21bf9b e-flex e-con-boxed e-con e-parent\" data-id=\"c21bf9b\" data-element_type=\"container\" data-settings=\"{&quot;content_width&quot;:&quot;boxed&quot;}\" data-core-v316-plus=\"true\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-c90e566 elementor-widget elementor-widget-text-editor\" data-id=\"c90e566\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.18.0 - 20-12-2023 *\/\n.elementor-widget-text-editor.elementor-drop-cap-view-stacked .elementor-drop-cap{background-color:#69727d;color:#fff}.elementor-widget-text-editor.elementor-drop-cap-view-framed .elementor-drop-cap{color:#69727d;border:3px solid;background-color:transparent}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap{margin-top:8px}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap-letter{width:1em;height:1em}.elementor-widget-text-editor .elementor-drop-cap{float:left;text-align:center;line-height:1;font-size:50px}.elementor-widget-text-editor .elementor-drop-cap-letter{display:inline-block}<\/style>\t\t\t\t<p><span style=\"color: #000000; font-family: Arial; font-size: 14.6667px; white-space-collapse: preserve;\">Let&#8217;s analyze the tokenization tool&#8230;&#8230;&#8230;<\/span><\/p><p>When adding the following sentence&#8230; &#8220;My name is Mckinzie Dotson and I am a freshman at Mount Saint Joseph University where I play soccer. But recently I haven&#8217;t been able to play soccer&#8221; to the tokenization tool, we can observe several things.<\/p><p>When asked whether spaces are sufficient to tokenize this English-language text, the answer is no. This is because white space alone cannot properly tokenize things like contractions. For example, the tool could not figure out what to do with the word &#8220;haven&#8217;t&#8221; in my sentence above. In one case it was tokenized as &#8220;have&#8221;, &#8220;n&#8221;, &#8220;&#8217;&#8221;, &#8220;t&#8221;.<\/p><p>\u00a0<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Is white space tokenization enough? 
Let&#8217;s analyze the tokenization tool&#8230;&#8230;&#8230; When adding the following sentence&#8230; &#8220;My name is Mckinzie Dotson [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"site-sidebar-layout":"no-sidebar","site-content-layout":"","ast-site-content-layout":"full-width-container","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"disabled","ast-breadcrumbs-content":"","ast-featured-img":"disabled","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"footnotes":""},"class_list":["post-37","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/pages\/37","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/comments?post=37"}],"version-history":[{"count":11,"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/pages\/37\/revisions"}],"predecessor-version":[{"id":194,"href":"https:\/\/citestu17.savecicadabuzz
.org\/index.php\/wp-json\/wp\/v2\/pages\/37\/revisions\/194"}],"wp:attachment":[{"href":"https:\/\/citestu17.savecicadabuzz.org\/index.php\/wp-json\/wp\/v2\/media?parent=37"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}