diff --git a/api/README.md b/api/README.md index 84c534e..70d85de 100644 --- a/api/README.md +++ b/api/README.md @@ -11,12 +11,9 @@ we recommend [deploying your own instance](/docs/run-an-instance.md) if you wish you can read [the api documentation here](/docs/api.md). -> [!WARNING] -> the v7 public api (/api/json) will be shut down on **november 11th, 2024**. -> you can access documentation for it [here](https://github.com/imputnet/cobalt/blob/7/docs/api.md). - ## supported services -this list is not final and keeps expanding over time. if support for a service you want is missing, create an issue (or a pull request 👀). +this list is not final and keeps expanding over time! +if the desired service isn't supported yet, feel free to create an appropriate issue (or a pull request 👀). | service | video + audio | only audio | only video | metadata | rich file names | | :-------- | :-----------: | :--------: | :--------: | :------: | :-------------: | @@ -39,12 +36,13 @@ this list is not final and keeps expanding over time. if support for a service y | twitter/x | ✅ | ✅ | ✅ | ➖ | ➖ | | vimeo | ✅ | ✅ | ✅ | ✅ | ✅ | | vk videos & clips | ✅ | ❌ | ✅ | ✅ | ✅ | +| xiaohongshu | ✅ | ✅ | ✅ | ➖ | ➖ | | youtube | ✅ | ✅ | ✅ | ✅ | ✅ | | emoji | meaning | | :-----: | :---------------------- | | ✅ | supported | -| ➖ | impossible/unreasonable | +| ➖ | unreasonable/impossible | | ❌ | not supported | ### additional notes or features (per service) @@ -71,36 +69,35 @@ as long as you: - provide a link to the license and indicate if changes to the code were made, and - release the code under the **same license** -## acknowledgements +## open source acknowledgements ### ffmpeg -cobalt heavily relies on ffmpeg for converting and merging media files. it's an absolutely amazing piece of software offered for anyone for free, yet doesn't receive as much credit as it should. +cobalt relies on ffmpeg for muxing and encoding media files. 
ffmpeg is absolutely spectacular and we're privileged to have an ability to use it for free, just like anyone else. we believe it should be way more recognized. you can [support ffmpeg here](https://ffmpeg.org/donations.html)! -#### ffmpeg-static -we use [ffmpeg-static](https://github.com/eugeneware/ffmpeg-static) to get binaries for ffmpeg depending on the platform. - -you can support the developer via various methods listed on their github page! (linked above) - ### youtube.js -cobalt relies on [youtube.js](https://github.com/LuanRT/YouTube.js) for interacting with the innertube api, it wouldn't have been possible without it. +cobalt relies on **[youtube.js](https://github.com/LuanRT/YouTube.js)** for interacting with youtube's innertube api, it wouldn't have been possible without this package. -you can support the developer via various methods listed on their github page! (linked above) +you can support the developer via various methods listed on their github page! +(linked above) ### many others -cobalt also depends on: - -- [content-disposition-header](https://www.npmjs.com/package/content-disposition-header) to simplify the provision of `content-disposition` headers. -- [cors](https://www.npmjs.com/package/cors) to manage cross-origin resource sharing within expressjs. -- [dotenv](https://www.npmjs.com/package/dotenv) to load environment variables from the `.env` file. -- [express](https://www.npmjs.com/package/express) as the backbone of cobalt servers. -- [express-rate-limit](https://www.npmjs.com/package/express-rate-limit) to rate limit api endpoints. -- [hls-parser](https://www.npmjs.com/package/hls-parser) to parse `m3u8` playlists for certain services. -- [ipaddr.js](https://www.npmjs.com/package/ipaddr.js) to parse ip addresses (for rate limiting). -- [nanoid](https://www.npmjs.com/package/nanoid) to generate unique (temporary) identifiers for each requested stream. -- [psl](https://www.npmjs.com/package/psl) as the domain name parser. 
-- [set-cookie-parser](https://www.npmjs.com/package/set-cookie-parser) to parse cookies that cobalt receives from certain services. -- [undici](https://www.npmjs.com/package/undici) for making http requests. -- [url-pattern](https://www.npmjs.com/package/url-pattern) to match provided links with supported patterns. +cobalt-api also depends on: + +- **[content-disposition-header](https://www.npmjs.com/package/content-disposition-header)** to simplify the provision of `content-disposition` headers. +- **[cors](https://www.npmjs.com/package/cors)** to manage cross-origin resource sharing within expressjs. +- **[dotenv](https://www.npmjs.com/package/dotenv)** to load environment variables from the `.env` file. +- **[express](https://www.npmjs.com/package/express)** as the backbone of cobalt servers. +- **[express-rate-limit](https://www.npmjs.com/package/express-rate-limit)** to rate limit api endpoints. +- **[ffmpeg-static](https://www.npmjs.com/package/ffmpeg-static)** to get binaries for ffmpeg depending on the platform. +- **[hls-parser](https://www.npmjs.com/package/hls-parser)** to parse HLS playlists according to spec (very impressive stuff). +- **[ipaddr.js](https://www.npmjs.com/package/ipaddr.js)** to parse ip addresses (used for rate limiting). +- **[nanoid](https://www.npmjs.com/package/nanoid)** to generate unique identifiers for each requested tunnel. +- **[set-cookie-parser](https://www.npmjs.com/package/set-cookie-parser)** to parse cookies that cobalt receives from certain services. +- **[undici](https://www.npmjs.com/package/undici)** for making http requests. +- **[url-pattern](https://www.npmjs.com/package/url-pattern)** to match provided links with supported patterns. +- **[zod](https://www.npmjs.com/package/zod)** to lock down the api request schema. +- **[@datastructures-js/priority-queue](https://www.npmjs.com/package/@datastructures-js/priority-queue)** for sorting stream caches for future clean up (without redis). 
+- **[@imput/psl](https://www.npmjs.com/package/@imput/psl)** as the domain name parser, our fork of [psl](https://www.npmjs.com/package/psl). ...and many other packages that these packages rely on. diff --git a/api/package.json b/api/package.json index 829106c..fc5304e 100644 --- a/api/package.json +++ b/api/package.json @@ -1,7 +1,7 @@ { "name": "@imput/cobalt-api", "description": "save what you love", - "version": "10.6", + "version": "10.7.7", "author": "imput", "exports": "./src/cobalt.js", "type": "module", @@ -39,7 +39,7 @@ "set-cookie-parser": "2.6.0", "undici": "^5.19.1", "url-pattern": "1.0.3", - "youtubei.js": "^13.0.0", + "youtubei.js": "^13.1.0", "zod": "^3.23.8" }, "optionalDependencies": { diff --git a/api/src/index.js b/api/src/index.js deleted file mode 100644 index 41fc750..0000000 --- a/api/src/index.js +++ /dev/null @@ -1,32 +0,0 @@ -import 'dotenv/config'; - -import express from 'express'; -import cluster from 'node:cluster'; - -import path from 'path'; -import { fileURLToPath } from 'url'; - -import { env, isCluster } from './config.js'; -import { Red } from './misc/console-text.js'; -import { initCluster } from './misc/cluster.js'; - -const app = express(); - -const __filename = fileURLToPath(import.meta.url); -const __dirname = path.dirname(__filename).slice(0, -4); - -app.disable('x-powered-by'); - -if (env.apiURL) { - const { runAPI } = await import('./core/api.js'); - - if (isCluster) { - await initCluster(); - } - - runAPI(express, app, __dirname, cluster.isPrimary); -} else { - console.log( - Red("API_URL env variable is missing, Team Hydra Video Downloader api can't start."), - ); -} diff --git a/api/src/misc/run-test.js b/api/src/misc/run-test.js index 21d97d0..6dd0818 100644 --- a/api/src/misc/run-test.js +++ b/api/src/misc/run-test.js @@ -23,6 +23,15 @@ export async function runTest(url, params, expect) { if (expect.status !== result.body.status) { const detail = `${expect.status} (expected) != ${result.body.status} (actual)`; 
error.push(`status mismatch: ${detail}`); + + if (result.body.status === 'error') { + error.push(`error code: ${result.body?.error?.code}`); + } + } + + if (expect.errorCode && expect.errorCode !== result.body?.error?.code) { + const detail = `${expect.errorCode} (expected) != ${result.body?.error?.code} (actual)`; + error.push(`error mismatch: ${detail}`); } if (expect.code !== result.status) { diff --git a/api/src/misc/utils.js b/api/src/misc/utils.js index 331528d..76d7a3e 100644 --- a/api/src/misc/utils.js +++ b/api/src/misc/utils.js @@ -1,12 +1,14 @@ +import { request } from 'undici'; const redirectStatuses = new Set([301, 302, 303, 307, 308]); -export async function getRedirectingURL(url, dispatcher) { - const location = await fetch(url, { - redirect: 'manual', +export async function getRedirectingURL(url, dispatcher, userAgent) { + const location = await request(url, { dispatcher, - }).then((r) => { - if (redirectStatuses.has(r.status) && r.headers.has('location')) { - return r.headers.get('location'); + method: 'HEAD', + headers: { 'user-agent': userAgent } + }).then(r => { + if (redirectStatuses.has(r.statusCode) && r.headers['location']) { + return r.headers['location']; } }).catch(() => null); diff --git a/api/src/processing/match.js b/api/src/processing/match.js index 9fabf37..e2d6aa0 100644 --- a/api/src/processing/match.js +++ b/api/src/processing/match.js @@ -120,9 +120,8 @@ export default async function({ host, patternMatch, params }) { case "reddit": r = await reddit({ - sub: patternMatch.sub, - id: patternMatch.id, - user: patternMatch.user + ...patternMatch, + dispatcher, }); break; @@ -228,7 +227,8 @@ export default async function({ host, patternMatch, params }) { case "facebook": r = await facebook({ - ...patternMatch + ...patternMatch, + dispatcher }); break; diff --git a/api/src/processing/service-config.js b/api/src/processing/service-config.js index 86352f9..1dc8bf3 100644 --- a/api/src/processing/service-config.js +++ 
b/api/src/processing/service-config.js @@ -35,13 +35,25 @@ export const services = { }, instagram: { patterns: [ - "reels/:postId", - ":username/reel/:postId", - "reel/:postId", "p/:postId", - ":username/p/:postId", "tv/:postId", - "stories/:username/:storyId" + "reel/:postId", + "reels/:postId", + "stories/:username/:storyId", + + /* + share & username links use the same url pattern, + so we test the share pattern first, cuz id type is different. + however, if someone has the "share" username and the user + somehow gets a link of this ancient style, it's joever. + */ + + "share/:shareId", + "share/p/:shareId", + "share/reel/:shareId", + + ":username/p/:postId", + ":username/reel/:postId", ], altDomains: ["ddinstagram.com"], }, @@ -64,8 +76,21 @@ export const services = { }, reddit: { patterns: [ + "comments/:id", + + "r/:sub/comments/:id", "r/:sub/comments/:id/:title", - "user/:user/comments/:id/:title" + "r/:sub/comments/:id/comment/:commentId", + + "user/:user/comments/:id", + "user/:user/comments/:id/:title", + "user/:user/comments/:id/comment/:commentId", + + "r/u_:user/comments/:id", + "r/u_:user/comments/:id/:title", + "r/u_:user/comments/:id/comment/:commentId", + + "r/:sub/s/:shareId" ], subdomains: "*", }, diff --git a/api/src/processing/service-patterns.js b/api/src/processing/service-patterns.js index 42f64d2..8735f12 100644 --- a/api/src/processing/service-patterns.js +++ b/api/src/processing/service-patterns.js @@ -6,7 +6,8 @@ export const testers = { "dailymotion": pattern => pattern.id?.length <= 32, "instagram": pattern => - pattern.postId?.length <= 12 + pattern.postId?.length <= 48 + || pattern.shareId?.length <= 16 || (pattern.username?.length <= 30 && pattern.storyId?.length <= 24), "loom": pattern => @@ -19,8 +20,10 @@ export const testers = { pattern.id?.length <= 128 || pattern.shortLink?.length <= 32, "reddit": pattern => - (pattern.sub?.length <= 22 && pattern.id?.length <= 10) - || (pattern.user?.length <= 22 && pattern.id?.length <= 10), 
+ pattern.id?.length <= 16 && !pattern.sub && !pattern.user + || (pattern.sub?.length <= 22 && pattern.id?.length <= 16) + || (pattern.user?.length <= 22 && pattern.id?.length <= 16) + || (pattern.sub?.length <= 22 && pattern.shareId?.length <= 16), "rutube": pattern => (pattern.id?.length === 32 && pattern.key?.length <= 32) || diff --git a/api/src/processing/services/bilibili.js b/api/src/processing/services/bilibili.js index b47b0bc..4ee148d 100644 --- a/api/src/processing/services/bilibili.js +++ b/api/src/processing/services/bilibili.js @@ -1,19 +1,8 @@ import { genericUserAgent, env } from "../../config.js"; +import { resolveRedirectingURL } from "../url.js"; // TO-DO: higher quality downloads (currently requires an account) -function com_resolveShortlink(shortId) { - return fetch(`https://b23.tv/${shortId}`, { redirect: 'manual' }) - .then(r => r.status > 300 && r.status < 400 && r.headers.get('location')) - .then(url => { - if (!url) return; - const path = new URL(url).pathname; - if (path.startsWith('/video/')) - return path.split('/')[2]; - }) - .catch(() => {}) -} - function getBest(content) { return content?.filter(v => v.baseUrl || v.url) .map(v => (v.baseUrl = v.baseUrl || v.url, v)) @@ -99,7 +88,8 @@ async function tv_download(id) { export default async function({ comId, tvId, comShortLink }) { if (comShortLink) { - comId = await com_resolveShortlink(comShortLink); + const patternMatch = await resolveRedirectingURL(`https://b23.tv/${comShortLink}`); + comId = patternMatch?.comId; } if (comId) { diff --git a/api/src/processing/services/facebook.js b/api/src/processing/services/facebook.js index 7bfd475..9e9d060 100644 --- a/api/src/processing/services/facebook.js +++ b/api/src/processing/services/facebook.js @@ -8,8 +8,8 @@ const headers = { 'Sec-Fetch-Site': 'none', } -const resolveUrl = (url) => { - return fetch(url, { headers }) +const resolveUrl = (url, dispatcher) => { + return fetch(url, { headers, dispatcher }) .then(r => { if 
(r.headers.get('location')) { return decodeURIComponent(r.headers.get('location')); @@ -23,13 +23,13 @@ const resolveUrl = (url) => { .catch(() => false); } -export default async function({ id, shareType, shortLink }) { +export default async function({ id, shareType, shortLink, dispatcher }) { let url = `https://web.facebook.com/i/videos/${id}`; if (shareType) url = `https://web.facebook.com/share/${shareType}/${id}`; - if (shortLink) url = await resolveUrl(`https://fb.watch/${shortLink}`); + if (shortLink) url = await resolveUrl(`https://fb.watch/${shortLink}`, dispatcher); - const html = await fetch(url, { headers }) + const html = await fetch(url, { headers, dispatcher }) .then(r => r.text()) .catch(() => false); diff --git a/api/src/processing/services/instagram.js b/api/src/processing/services/instagram.js index d9a646a..9cc7dbd 100644 --- a/api/src/processing/services/instagram.js +++ b/api/src/processing/services/instagram.js @@ -1,3 +1,5 @@ +import { randomBytes } from "node:crypto"; +import { resolveRedirectingURL } from "../url.js"; import { genericUserAgent } from "../../config.js"; import { createStream } from "../../stream/manage.js"; import { getCookie, updateCookie } from "../cookie/manager.js"; @@ -8,6 +10,7 @@ const commonHeaders = { "sec-fetch-site": "same-origin", "x-ig-app-id": "936619743392459" } + const mobileHeaders = { "x-ig-app-locale": "en_US", "x-ig-device-locale": "en_US", @@ -19,6 +22,7 @@ const mobileHeaders = { "x-fb-server-cluster": "True", "content-length": "0", } + const embedHeaders = { "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7", "Accept-Language": "en-GB,en;q=0.9", @@ -33,7 +37,7 @@ const embedHeaders = { "Sec-Fetch-Site": "none", "Sec-Fetch-User": "?1", "Upgrade-Insecure-Requests": "1", - "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36", + 
"User-Agent": genericUserAgent, } const cachedDtsg = { @@ -41,7 +45,17 @@ const cachedDtsg = { expiry: 0 } -export default function(obj) { +const getNumberFromQuery = (name, data) => { + const s = data?.match(new RegExp(name + '=(\\d+)'))?.[1]; + if (+s) return +s; +} + +const getObjectFromEntries = (name, data) => { + const obj = data?.match(new RegExp('\\["' + name + '",.*?,({.*?}),\\d+\\]'))?.[1]; + return obj && JSON.parse(obj); +} + +export default function instagram(obj) { const dispatcher = obj.dispatcher; async function findDtsgId(cookie) { @@ -91,6 +105,7 @@ export default function(obj) { updateCookie(cookie, data.headers); return data.json(); } + async function getMediaId(id, { cookie, token } = {}) { const oembedURL = new URL('https://i.instagram.com/api/v1/oembed/'); oembedURL.searchParams.set('url', `https://www.instagram.com/p/${id}/`); @@ -119,6 +134,7 @@ export default function(obj) { return mediaInfo?.items?.[0]; } + async function requestHTML(id, cookie) { const data = await fetch(`https://www.instagram.com/p/${id}/embed/captioned/`, { headers: { @@ -136,40 +152,167 @@ export default function(obj) { return embedData; } + + async function getGQLParams(id, cookie) { + const req = await fetch(`https://www.instagram.com/p/${id}/`, { + headers: { + ...embedHeaders, + cookie + }, + dispatcher + }); + + const html = await req.text(); + const siteData = getObjectFromEntries('SiteData', html); + const polarisSiteData = getObjectFromEntries('PolarisSiteData', html); + const webConfig = getObjectFromEntries('DGWWebConfig', html); + const pushInfo = getObjectFromEntries('InstagramWebPushInfo', html); + const lsd = getObjectFromEntries('LSD', html)?.token || randomBytes(8).toString('base64url'); + const csrf = getObjectFromEntries('InstagramSecurityConfig', html)?.csrf_token; + + const anon_cookie = [ + csrf && "csrftoken=" + csrf, + polarisSiteData?.device_id && "ig_did=" + polarisSiteData?.device_id, + "wd=1280x720", + "dpr=2", + polarisSiteData?.machine_id 
&& "mid=" + polarisSiteData.machine_id, + "ig_nrcb=1" + ].filter(a => a).join('; '); + + return { + headers: { + 'x-ig-app-id': webConfig?.appId || '936619743392459', + 'X-FB-LSD': lsd, + 'X-CSRFToken': csrf, + 'X-Bloks-Version-Id': getObjectFromEntries('WebBloksVersioningID', html)?.versioningID, + 'x-asbd-id': 129477, + cookie: anon_cookie + }, + body: { + __d: 'www', + __a: '1', + __s: '::' + Math.random().toString(36).substring(2).replace(/\d/g, '').slice(0, 6), + __hs: siteData?.haste_session || '20126.HYP:instagram_web_pkg.2.1...0', + __req: 'b', + __ccg: 'EXCELLENT', + __rev: pushInfo?.rollout_hash || '1019933358', + __hsi: siteData?.hsi || '7436540909012459023', + __dyn: randomBytes(154).toString('base64url'), + __csr: randomBytes(154).toString('base64url'), + __user: '0', + __comet_req: getNumberFromQuery('__comet_req', html) || '7', + av: '0', + dpr: '2', + lsd, + jazoest: getNumberFromQuery('jazoest', html) || Math.floor(Math.random() * 10000), + __spin_r: siteData?.__spin_r || '1019933358', + __spin_b: siteData?.__spin_b || 'trunk', + __spin_t: siteData?.__spin_t || Math.floor(new Date().getTime() / 1000), + } + }; + } + async function requestGQL(id, cookie) { - let dtsgId; + const { headers, body } = await getGQLParams(id, cookie); - if (cookie) { - dtsgId = await findDtsgId(cookie); - } - const url = new URL('https://www.instagram.com/api/graphql/'); + const req = await fetch('https://www.instagram.com/graphql/query', { + method: 'POST', + dispatcher, + headers: { + ...embedHeaders, + ...headers, + cookie, + 'content-type': 'application/x-www-form-urlencoded', + 'X-FB-Friendly-Name': 'PolarisPostActionLoadPostQueryQuery', + }, + body: new URLSearchParams({ + ...body, + fb_api_caller_class: 'RelayModern', + fb_api_req_friendly_name: 'PolarisPostActionLoadPostQueryQuery', + variables: JSON.stringify({ + shortcode: id, + fetch_tagged_user_count: null, + hoisted_comment_id: null, + hoisted_reply_id: null + }), + server_timestamps: true, + doc_id: 
'8845758582119845' + }).toString() + }); - const requestData = { - jazoest: '26406', - variables: JSON.stringify({ - shortcode: id, - __relay_internal__pv__PolarisShareMenurelayprovider: false - }), - doc_id: '7153618348081770' + return { + gql_data: await req.json() + .then(r => r.data) + .catch(() => null) }; - if (dtsgId) { - requestData.fb_dtsg = dtsgId; + } + + async function getErrorContext(id) { + try { + const { headers, body } = await getGQLParams(id); + + const req = await fetch('https://www.instagram.com/ajax/bulk-route-definitions/', { + method: 'POST', + dispatcher, + headers: { + ...embedHeaders, + ...headers, + 'content-type': 'application/x-www-form-urlencoded', + 'X-Ig-D': 'www', + }, + body: new URLSearchParams({ + 'route_urls[0]': `/p/${id}/`, + routing_namespace: 'igx_www', + ...body + }).toString() + }); + + const response = await req.text(); + if (response.includes('"tracePolicy":"polaris.privatePostPage"')) + return { error: 'content.post.private' }; + + const [, mediaId, mediaOwnerId] = response.match( + /"media_id":\s*?"(\d+)","media_owner_id":\s*?"(\d+)"/ + ) || []; + + if (mediaId && mediaOwnerId) { + const rulingURL = new URL('https://www.instagram.com/api/v1/web/get_ruling_for_media_content_logged_out'); + rulingURL.searchParams.set('media_id', mediaId); + rulingURL.searchParams.set('owner_id', mediaOwnerId); + + const rulingResponse = await fetch(rulingURL, { + headers: { + ...headers, + ...commonHeaders + }, + dispatcher, + }).then(a => a.json()).catch(() => ({})); + + if (rulingResponse?.title?.includes('Restricted')) + return { error: "content.post.age" }; + } + } catch { + return { error: "fetch.fail" }; } - return (await request(url, cookie, 'POST', requestData)) - .data - ?.xdt_api__v1__media__shortcode__web_info - ?.items - ?.[0]; + return { error: "fetch.empty" }; } function extractOldPost(data, id, alwaysProxy) { - const sidecar = data?.gql_data?.shortcode_media?.edge_sidecar_to_children; + const shortcodeMedia = 
data?.gql_data?.shortcode_media || data?.gql_data?.xdt_shortcode_media; + const sidecar = shortcodeMedia?.edge_sidecar_to_children; + if (sidecar) { const picker = sidecar.edges.filter(e => e.node?.display_url) .map((e, i) => { - const type = e.node?.is_video ? "video" : "photo"; - const url = type === "video" ? e.node?.video_url : e.node?.display_url; + const type = e.node?.is_video && e.node?.video_url ? "video" : "photo"; + + let url; + if (type === "video") { + url = e.node?.video_url; + } else if (type === "photo") { + url = e.node?.display_url; + } let itemExt = type === "video" ? "mp4" : "jpg"; @@ -196,16 +339,21 @@ export default function(obj) { }); if (picker.length) return { picker } - } else if (data?.gql_data?.shortcode_media?.video_url) { + } + + if (shortcodeMedia?.video_url) { return { - urls: data.gql_data.shortcode_media.video_url, + urls: shortcodeMedia.video_url, filename: `instagram_${id}.mp4`, audioFilename: `instagram_${id}_audio` } - } else if (data?.gql_data?.shortcode_media?.display_url) { + } + + if (shortcodeMedia?.display_url) { return { - urls: data.gql_data?.shortcode_media.display_url, - isPhoto: true + urls: shortcodeMedia.display_url, + isPhoto: true, + filename: `instagram_${id}.jpg`, } } } @@ -266,7 +414,9 @@ export default function(obj) { } async function getPost(id, alwaysProxy) { - const hasData = (data) => data && data.gql_data !== null; + const hasData = (data) => data + && data.gql_data !== null + && data?.gql_data?.xdt_shortcode_media !== null; let data, result; try { const cookie = getCookie('instagram'); @@ -295,7 +445,9 @@ export default function(obj) { if (!hasData(data) && cookie) data = await requestGQL(id, cookie); } catch {} - if (!data) return { error: "fetch.fail" }; + if (!hasData(data)) { + return getErrorContext(id); + } if (data?.gql_data) { result = extractOldPost(data, id, alwaysProxy) @@ -358,14 +510,30 @@ export default function(obj) { if (item.image_versions2?.candidates) { return { urls: 
item.image_versions2.candidates[0].url, - isPhoto: true + isPhoto: true, + filename: `instagram_${id}.jpg`, } } return { error: "link.unsupported" }; } - const { postId, storyId, username, alwaysProxy } = obj; + const { postId, shareId, storyId, username, alwaysProxy } = obj; + + if (shareId) { + return resolveRedirectingURL( + `https://www.instagram.com/share/${shareId}/`, + dispatcher, + // for some reason instagram decides to return HTML + // instead of a redirect when requesting with a normal + // browser user-agent + 'curl/7.88.1' + ).then(match => instagram({ + ...obj, ...match, + shareId: undefined + })); + } + if (postId) return getPost(postId, alwaysProxy); if (username && storyId) return getStory(username, storyId); diff --git a/api/src/processing/services/ok.js b/api/src/processing/services/ok.js index 10fb785..cfe18e4 100644 --- a/api/src/processing/services/ok.js +++ b/api/src/processing/services/ok.js @@ -44,7 +44,7 @@ export default async function(o) { let fileMetadata = { title: videoData.movie.title.trim(), - author: (videoData.author?.name || videoData.compilationTitle).trim(), + author: (videoData.author?.name || videoData.compilationTitle)?.trim(), } if (bestVideo) return { diff --git a/api/src/processing/services/pinterest.js b/api/src/processing/services/pinterest.js index 9c0ac9c..ea4275c 100644 --- a/api/src/processing/services/pinterest.js +++ b/api/src/processing/services/pinterest.js @@ -1,4 +1,5 @@ import { genericUserAgent } from "../../config.js"; +import { resolveRedirectingURL } from "../url.js"; const videoRegex = /"url":"(https:\/\/v1\.pinimg\.com\/videos\/.*?)"/g; const imageRegex = /src="(https:\/\/i\.pinimg\.com\/.*\.(jpg|gif))"/g; @@ -7,10 +8,10 @@ export default async function(o) { let id = o.id; if (!o.id && o.shortLink) { - id = await fetch(`https://api.pinterest.com/url_shortener/${o.shortLink}/redirect/`, { redirect: "manual" }) - .then(r => r.headers.get("location").split('pin/')[1].split('/')[0]) - .catch(() => {}); + 
const patternMatch = await resolveRedirectingURL(`https://api.pinterest.com/url_shortener/${o.shortLink}/redirect/`); + id = patternMatch?.id; } + if (id.includes("--")) id = id.split("--")[1]; if (!id) return { error: "fetch.fail" }; @@ -26,8 +27,8 @@ export default async function(o) { if (videoLink) return { urls: videoLink, - filename: `pinterest_${o.id}.mp4`, - audioFilename: `pinterest_${o.id}_audio` + filename: `pinterest_${id}.mp4`, + audioFilename: `pinterest_${id}_audio` } const imageLink = [...html.matchAll(imageRegex)] @@ -39,7 +40,7 @@ export default async function(o) { if (imageLink) return { urls: imageLink, isPhoto: true, - filename: `pinterest_${o.id}.${imageType}` + filename: `pinterest_${id}.${imageType}` } return { error: "fetch.empty" }; diff --git a/api/src/processing/services/reddit.js b/api/src/processing/services/reddit.js index 701db23..50c78d3 100644 --- a/api/src/processing/services/reddit.js +++ b/api/src/processing/services/reddit.js @@ -1,3 +1,4 @@ +import { resolveRedirectingURL } from "../url.js"; import { genericUserAgent, env } from "../../config.js"; import { getCookie, updateCookieValues } from "../cookie/manager.js"; @@ -48,12 +49,20 @@ async function getAccessToken() { } export default async function(obj) { - let url = new URL(`https://www.reddit.com/r/${obj.sub}/comments/${obj.id}.json`); - - if (obj.user) { - url.pathname = `/user/${obj.user}/comments/${obj.id}.json`; + let params = obj; + + if (!params.id && params.shareId) { + params = await resolveRedirectingURL( + `https://www.reddit.com/r/${params.sub}/s/${params.shareId}`, + obj.dispatcher, + genericUserAgent + ); } + if (!params?.id) return { error: "fetch.short_link" }; + + const url = new URL(`https://www.reddit.com/comments/${params.id}.json`); + const accessToken = await getAccessToken(); if (accessToken) url.hostname = 'oauth.reddit.com'; @@ -73,12 +82,17 @@ export default async function(obj) { data = data[0]?.data?.children[0]?.data; - const id = 
`${String(obj.sub).toLowerCase()}_${obj.id}`; + let sourceId; + if (params.sub || params.user) { + sourceId = `${String(params.sub || params.user).toLowerCase()}_${params.id}`; + } else { + sourceId = params.id; + } if (data?.url?.endsWith('.gif')) return { typeId: "redirect", urls: data.url, - filename: `reddit_${id}.gif`, + filename: `reddit_${sourceId}.gif`, } if (!data.secure_media?.reddit_video) @@ -87,8 +101,9 @@ export default async function(obj) { if (data.secure_media?.reddit_video?.duration > env.durationLimit) return { error: "content.too_long" }; + const video = data.secure_media?.reddit_video?.fallback_url?.split('?')[0]; + let audio = false, - video = data.secure_media?.reddit_video?.fallback_url?.split('?')[0], audioFileLink = `${data.secure_media?.reddit_video?.fallback_url?.split('DASH')[0]}audio`; if (video.match('.mp4')) { @@ -121,7 +136,7 @@ export default async function(obj) { typeId: "tunnel", type: "merge", urls: [video, audioFileLink], - audioFilename: `reddit_${id}_audio`, - filename: `reddit_${id}.mp4` + audioFilename: `reddit_${sourceId}_audio`, + filename: `reddit_${sourceId}.mp4` } } diff --git a/api/src/processing/services/snapchat.js b/api/src/processing/services/snapchat.js index 4c62a5f..f5d6613 100644 --- a/api/src/processing/services/snapchat.js +++ b/api/src/processing/services/snapchat.js @@ -1,7 +1,6 @@ -import { extract, normalizeURL } from "../url.js"; +import { resolveRedirectingURL } from "../url.js"; import { genericUserAgent } from "../../config.js"; import { createStream } from "../../stream/manage.js"; -import { getRedirectingURL } from "../../misc/utils.js"; const SPOTLIGHT_VIDEO_REGEX = //; const NEXT_DATA_REGEX = /