Support community for TTG plugins and products.
NOTICE
The Turning Gate's Community has moved to a new home, at https://discourse.theturninggate.net.
This forum is now closed, and exists here as a read-only archive.
After the update from CE3 to CE4, the website has now been running for a few days. One problem has come up:
In CE3, building sitemaps and image sitemaps has been no problem at all. In CE4, the sitemaps are almost empty.
The reason seems to be related to wrong canonical tags:
In galleries.php, line 50 reads
<link rel="canonical" href="https://www.haraldjoergens.com:443/" />
when it should be
<link rel="canonical" href="https://www.haraldjoergens.com/galleries.php" />
The PHP code line in the exported "galleries.php" is
<link rel="canonical" href="<?php echo currentPageLocation(); ?>/" />
Does anyone have an idea what's going wrong or, more importantly, how to fix it?
Thanks!
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
Just looking at the code in the exported PHP file, the comment for the function currentPageLocation says that it "Gets current page parent location", which, for pages in the root of the site, yields the site URL.
I don't know where the :443 is coming from in yours. Mine (from the galleries.php page or any of the pages generated by CE4 Pages) looks like this:
<link rel="canonical" href="http://rodbarbee.com/" />
if I come to my site by going to http://rodbarbee.com
or this:
<link rel="canonical" href="http://barbeephoto.com/" />
if I go to my site using http://barbeephoto.com
(I redirect the rodbarbee.com URL to barbeephoto.com)
So it doesn't look like this is coming from TTG; rather, it's coming from the server.
That 443 is in several places in your source code, and my guess is that it's coming from your host.
Matt or Ben will have to comment about any "fix", if there is one.
Rod  
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
Hi Rod,
Thanks for having a look. The 443 is the port for HTTPS, so that's correct, and it does indeed come from the server.
On most pages everything seems to be correct. I picked a random page and found
<link rel="canonical" href="https://www.haraldjoergens.com:443/galleries/rowing/the-boat-race/" />
which is correct.
The problem seems to be specific to galleries.php, where the canonical tag refers to the root, instead of galleries.php.
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
The problem seems to be specific to galleries.php, where the canonical tag refers to the root, instead of galleries.php
I think that's what's supposed to happen. All pages in the root of the site will give the same result. The code gives the parent location of the page. 
If you were looking at a gallery's index page, the canonical URL would be something like yoursite.com/galleries/some-index/actual-gallery/
So it shows the folder, not the file.
Rod  
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
Hi Rod,
Please have a look at this Google information. The canonical should not point to the folder.
I have hardcoded the correct information in galleries.php, and the sitemap generation gets a bit further but then gets stuck again, so I guess the problem isn't limited to galleries.php.
Cheers
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
I don't write the code so this is something Matt will have to comment on.
Rod  
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
Hi Harald, can you post the contents of the currentPage functions as found in the template?
Offline
Hi Ben,
Here is the code from galleries.php. It's quite different to the very short function from CE3.
Thanks!
Harald
if (!function_exists('currentPageLocation')) {
        // Gets current page parent location
        function currentPageLocation() {
            $currentPageURL = currentPageURL();
            $returnURL = '';
            $finalSlash = strrpos(currentPageURL(), '/');
            if (strrpos($currentPageURL, '.') > strrpos($currentPageURL, '/')) // has a file after the final slash, e.g. http://url/directory/index.php
                $returnURL = substr($currentPageURL, 0, $finalSlash);
            else if (strrpos($currentPageURL, '?') > strrpos($currentPageURL, '/')) // has a ? after the final slash, e.g. http://url/directory/index.php
                $returnURL = substr($currentPageURL, 0, $finalSlash);
            else if ($finalSlash == strlen($currentPageURL)-1) // final character is a slash, e.g. http://url/directory/
                $returnURL = substr($currentPageURL, 0, $finalSlash);
            else // final character is not a slash, e.g. http://url/directory
                $returnURL = $currentPageURL;
            return $returnURL;
        }
    }
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
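To make the branches of the posted function easier to follow, here is a minimal sketch that applies the same rules to a URL passed in as a parameter, so it can be tried without a web server. The name parentLocation() and the example.com URLs are hypothetical and are not part of the CE4 templates.

```php
<?php
// Hypothetical helper: the same branch logic as the posted
// currentPageLocation(), but the URL is a parameter instead of
// being read from the current request.
function parentLocation(string $url): string {
    $finalSlash = strrpos($url, '/');
    if ($finalSlash === false) {
        return $url; // no slash at all, nothing to strip
    }
    $lastDot = strrpos($url, '.');
    $lastQuery = strrpos($url, '?');

    // A file name after the final slash, e.g. /directory/index.php
    $hasFile = $lastDot !== false && $lastDot > $finalSlash;
    // A query string after the final slash, e.g. /directory/?page=2
    $hasQuery = $lastQuery !== false && $lastQuery > $finalSlash;
    // The URL ends in a slash, e.g. /directory/
    $endsWithSlash = $finalSlash === strlen($url) - 1;

    if ($hasFile || $hasQuery || $endsWithSlash) {
        return substr($url, 0, $finalSlash);
    }
    return $url; // e.g. /directory with no trailing slash
}

// A page in the site root loses its file name:
echo parentLocation('https://www.example.com/galleries.php'), "\n";
// https://www.example.com
echo parentLocation('https://www.example.com/galleries/rowing/index.php?page=2'), "\n";
// https://www.example.com/galleries/rowing
```

With the CE4 template appending "/" to this result, every page in the site root ends up with the site root as its canonical URL, which matches what Rod describes above.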
Hi Harald, can you also post currentPageURL?
Offline
Ben, here's the currentPageURL() function
if (!function_exists('currentPageURL')) {
        // Gets current page URL
        function currentPageURL() {
            $pageURL = 'http';
            if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] == 'on') {$pageURL .= "s";}
            $pageURL .= "://";
            if ($_SERVER["SERVER_PORT"] != "80" && strpos($host, ':') === false) {
                $pageURL .= $_SERVER["SERVER_NAME"].":".$_SERVER["SERVER_PORT"].$_SERVER["REQUEST_URI"];
            } else {
                $pageURL .= $_SERVER["SERVER_NAME"].$_SERVER["REQUEST_URI"];
            }
            return $pageURL;
        }
    }
Rod
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
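In the function as posted, $host is never assigned inside the function body, so the strpos($host, ':') === false test appears to hold on every request, and any port other than 80 is appended. That would explain the :443 on an HTTPS site. As a sketch only, and not the TTG code, the following variant omits the port when it is the default for the scheme. The name buildPageURL() and the array parameter (standing in for $_SERVER) are hypothetical.

```php
<?php
// Hypothetical variant of currentPageURL(): takes the server values as an
// array (in a template this would be $_SERVER) and leaves out the port
// when it is the default for the scheme (80 for http, 443 for https).
function buildPageURL(array $server): string {
    $https = !empty($server['HTTPS']) && $server['HTTPS'] === 'on';
    $scheme = $https ? 'https' : 'http';
    $defaultPort = $https ? '443' : '80';

    // SERVER_PORT may arrive as a string or an integer; compare as strings.
    $port = isset($server['SERVER_PORT']) ? (string) $server['SERVER_PORT'] : $defaultPort;

    $host = $server['SERVER_NAME'];
    if ($port !== $defaultPort) {
        $host .= ':' . $port; // only non-default ports appear in the URL
    }
    return $scheme . '://' . $host . $server['REQUEST_URI'];
}

echo buildPageURL([
    'HTTPS' => 'on',
    'SERVER_PORT' => '443',
    'SERVER_NAME' => 'www.example.com',
    'REQUEST_URI' => '/galleries.php',
]), "\n";
// https://www.example.com/galleries.php
```

Passing the values in as an array keeps the sketch testable from the command line; whether this matches how the plugin authors would want to handle ports is an open question.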
Thanks Rod. It looks like there may be a logic error in the code. Harald, can you try changing:
=== false
to:
!== false
?
Offline
Hi Ben,
Thanks! The change has a nasty side effect on other lines, and it turns this:
<meta property="og:image" content="https://www.haraldjoergens.com:443/photos/????????.DNG.jpg" />
<meta property="og:site_name" content="<!--TitleToReplace-->" />
into an "unexpected <" PHP error in the second line. I guess you are right about the problem, but just changing the "equals" to "not equals" has implications.
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
Hi Harald, that's strange. Does this line:
if ($_SERVER["SERVER_PORT"] != "80" && strpos($host, ':') === false) {
now look exactly like this line?
if ($_SERVER["SERVER_PORT"] != "80" && strpos($host, ':') !== false) {
Offline
Hi Ben,
Yes, it does. I tried it twice, only changing the "===" to "!==" instead of copying and pasting the line, and the result was the same. 
Looking at the code, I find it strange too, but I don't want to fiddle too much with a live website.
Of course I don't have a certificate on my test subdomain or localhost, so I can't really test HTTPS in a test environment.
Cheers
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
Hi Ben,
As another data point, I made the same change to my galleries.php file and I'm seeing no difference at all in the output page source code. I'm not getting the same error that Harald is seeing.
Rod  
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
Rod, did you test on an HTTPS site?
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
Ben,
With a bit more testing, I think the function currentPageLocation() is correct after all.
In the line that creates the "canonical" tag I think it should be the function currentPageURL() instead of currentPageLocation(), but I'm looking further into it to understand why the sitemap doesn't cover anything within /galleries/.
Cheers
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
HaraldJ wrote:
Rod, did you test on an HTTPS site?
No
Rod
Rod  
Just a user with way too much time on his hands.
www.rodbarbee.com
ttg-tips.com, Backlight 2/3 test site
Offline
Just had a look at the CE3 code:
<link rel="canonical" href="<?php echo currentPageLocation(); ?>/<?php echo currentPageName(); ?>" />
which means something like
<link rel="canonical" href="http://www.website.com/index.php" />
CE4 uses
<link rel="canonical" href="<?php echo currentPageLocation(); ?>/" />
which means something like
<link rel="canonical" href="http://www.website.com/" />
CE3 doesn't seem to care about HTTP or HTTPS, but sitemap building worked fine.
CE4 does care about HTTP and HTTPS, and sitemap building doesn't work.
That's not an analysis, just an observation. I still don't know where exactly the sitemap building goes wrong.
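The difference between the two template lines can be sketched for a single URL as follows. The functions pageLocation(), pageName(), canonicalCE3() and canonicalCE4() are hypothetical stand-ins that take the URL as a parameter. In particular, currentPageName() is not shown in this thread, so the sketch simply assumes it yields the file name after the final slash.

```php
<?php
// Hypothetical stand-in for currentPageLocation(): everything before the
// final slash, ignoring any query string.
function pageLocation(string $url): string {
    $path = strtok($url, '?');
    return substr($path, 0, strrpos($path, '/'));
}

// Assumption: currentPageName() gives the file name after the final slash.
function pageName(string $url): string {
    $path = strtok($url, '?');
    return (string) substr($path, strrpos($path, '/') + 1);
}

// CE3-style tag: parent location plus page name.
function canonicalCE3(string $url): string {
    $href = pageLocation($url) . '/' . pageName($url);
    return '<link rel="canonical" href="' . htmlspecialchars($href) . '" />';
}

// CE4-style tag: parent location only.
function canonicalCE4(string $url): string {
    $href = pageLocation($url) . '/';
    return '<link rel="canonical" href="' . htmlspecialchars($href) . '" />';
}

$url = 'https://www.example.com/galleries.php';
echo canonicalCE3($url), "\n";
// <link rel="canonical" href="https://www.example.com/galleries.php" />
echo canonicalCE4($url), "\n";
// <link rel="canonical" href="https://www.example.com/" />
```

Under these assumptions, the CE4 form gives galleries.php and every other page in the site root the same canonical URL.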
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
In the line that creates the "canonical" tag I think it should be the function currentPageURL() instead of currentPageLocation(), but I'm looking further into it to understand why the sitemap doesn't cover anything within /galleries/.
Not so. Using currentPageURL() would, for mobile galleries, return mobile.php; that's exactly what the canonical URLs are in place to prevent. For galleries, we want the mobile gallery pages to canonically be read as the main index.php, hence our use of currentPageLocation().
Offline
Thanks, Matt!
I see your point, but do you have an idea why building a sitemap ignores the galleries then?
Cheers
Harald
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
Sounds like the whole thing is misbehaving with Pages, as you have observed. It's possible that we might have unintentionally broken the function through the course of evolving the product, making codebase modifications for the sake of the publisher, etc. We probably made changes to the galleries, echoed those changes through all of the plugins for consistency in code, and neglected to account for this outcome. We will probably need to revisit the functions in CE4 Pages.
Offline
Today I have replaced, for all files on the website, the CE4 canonical tag code with the code from CE3:
CE4:
<link rel="canonical" href="<?php echo currentPageLocation(); ?>/" />
CE3:
<link rel="canonical" href="<?php echo currentPageLocation(); ?>/<?php echo currentPageName(); ?>" />
It was a test, and if anything goes wrong, I can go back to the original version.
But the sitemap building seems to go well now, and I tried the galleries on mobile devices, where they seem to be working without any problems.
I also added some experimental code to deal with the SEO pagination problem: search engines see duplicate content when they go through galleries with more than one page (see here for the technical background).
Thanks to the excellent PHP code in CE4, it seemed quite simple to add two additional link tags, "prev" and "next". This example is from page 5 of a gallery:
<link rel="canonical" href="https://www.haraldjoergens.com:443/galleries/rowing/the-boat-race/2016-01-31-cuwbc-vs-obubc/index.php" />
<link rel="prev" href="https://www.haraldjoergens.com:443/galleries/rowing/the-boat-race/2016-01-31-cuwbc-vs-obubc/index.php?page=4" />
<link rel="next" href="https://www.haraldjoergens.com:443/galleries/rowing/the-boat-race/2016-01-31-cuwbc-vs-obubc/index.php?page=6" />
I'll find out what the web crawlers will make of it!
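Harald's actual template change is not shown in the thread, so the following is only a sketch of how "prev" and "next" tags like the ones above could be generated. The function paginationLinks() and its parameters ($pageUrl, $page, $lastPage) are hypothetical, and the example.com URL is a placeholder.

```php
<?php
// Hypothetical helper: builds rel="prev" / rel="next" link tags for a
// paginated gallery. $pageUrl is the gallery's index.php URL without a
// query string, $page is the current page, $lastPage the final page.
function paginationLinks(string $pageUrl, int $page, int $lastPage): array {
    $links = [];
    if ($page > 1) {
        // Every page except the first has a predecessor.
        $links[] = '<link rel="prev" href="' . $pageUrl . '?page=' . ($page - 1) . '" />';
    }
    if ($page < $lastPage) {
        // Every page except the last has a successor.
        $links[] = '<link rel="next" href="' . $pageUrl . '?page=' . ($page + 1) . '" />';
    }
    return $links;
}

// Page 5 of an 8-page gallery gets both tags (page=4 and page=6).
echo implode("\n", paginationLinks('https://www.example.com/galleries/demo/index.php', 5, 8)), "\n";
```

The first page gets only a "next" tag and the last page only a "prev" tag, which seems the natural reading of the example above, though the thread does not say how the edge pages were handled.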
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline
That's really interesting. Thank you for sharing and including the background information! Please keep us posted on the success of crawlers indexing your paged galleries.
Best
Daniel Leu | Photography   
DanielLeu.com
My digital playground (eg, Backlight tips&tricks): lab.DanielLeu.com
Offline
Talking about sitemaps:
If you are using software or a service to create your sitemaps, it might be a good idea to exclude any links that contain
-single.php
thumbnails
sender.a.getAttribute('data-gps')
http://maps.google.com
If you don't exclude *-single.php, every photo will be seen as a page.
And if you don't exclude *thumbnails*, your image sitemap will include three entries for every photo: one from /photos, one from /thumbnails, and one from /thumbnails-for-mobile.
Excluding sender.a.getAttribute('data-gps') makes sense if you are using map information in your galleries; without this exclusion you might get a large number of broken links.
And the Google Maps exclusion stops every single photo with map information from being added as an extra page with Google Map information.
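If your sitemap tool has no built-in exclusion settings, the same filtering could be applied to a list of crawled URLs before the sitemap is written. This is a minimal sketch using plain substring matching. The constant EXCLUDE_PATTERNS and the functions shouldExclude() and filterSitemapUrls() are hypothetical names, and the example.com URLs are placeholders.

```php
<?php
// The four substrings suggested above; a URL containing any of them
// is left out of the sitemap.
const EXCLUDE_PATTERNS = [
    '-single.php',
    'thumbnails',
    "sender.a.getAttribute('data-gps')",
    'http://maps.google.com',
];

// Hypothetical helper: true if the URL contains any excluded substring.
function shouldExclude(string $url, array $patterns = EXCLUDE_PATTERNS): bool {
    foreach ($patterns as $pattern) {
        if (strpos($url, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: keeps only the URLs that should be in the sitemap.
function filterSitemapUrls(array $urls): array {
    return array_values(array_filter($urls, function ($url) {
        return !shouldExclude($url);
    }));
}

$kept = filterSitemapUrls([
    'https://www.example.com/galleries/demo/',
    'https://www.example.com/galleries/demo/photo-single.php',
    'https://www.example.com/galleries/demo/thumbnails/a.jpg',
    'https://www.example.com/galleries/demo/thumbnails-for-mobile/a.jpg',
    'https://www.example.com/galleries/demo/photos/a.jpg',
]);
print_r($kept); // keeps the gallery page and the /photos/ image
```

Because "thumbnails" is matched as a substring, it covers both /thumbnails and /thumbnails-for-mobile, leaving one image entry per photo.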
Harald Joergens ARPS
Harald Joergens Photography
Nutfield, Surrey, UK
Offline